The main purpose of HTTP transport is to reduce massive fetching overhead associated with large data sets (1M+ rows). It uses native Exasol commands EXPORT and IMPORT specifically designed to move large amounts of data. Data is transferred using CSV format with optional compression.
This is a powerful tool which helps to bypass creation of Python objects altogether and dramatically increase performance.
PyEXASOL offloads HTTP communication and decompression to separate process using multiprocessing module. Main process only reads or writes to pipe opened in binary mode.
Specify custom callback function to read or write from pipe and to apply complex logic. Use callback_params to pass additional parameters to callback function (e.g. options for pandas).
You may also specify import_params or export_params to alter IMPORT or EXPORT query and modify data stream.
Export data from Exasol into pandas.DataFrame. You may use callback_param argument to pass custom options for pandas read_csv function.
# Read from SQL
pd = C.export_to_pandas("SELECT * FROM users")
# Read from table
pd = C.export_to_pandas("users")Import data from pandas.DataFrame into Exasol table. You may use callback_param argument to pass custom options for pandas to_csv function.
C.import_from_pandas(pd, "users")my_list = [
(1, 'Bob', False, '2018-01-01'),
(2, 'Gill', True, '2018-02-01'),
]
C.import_from_iterable(my_list, "users")This function is suitable for very big INSERTS as long as generator returns rows 1-by-1 and does not run out of memory.
def my_generator():
for i in range(5):
yield (i, 'Bob', True, '2017-01-01')
C.import_from_iterable(my_generator, "users")Import data from file, path object or file-like object opened in binary mode. You may import from process STDIN using sys.stdin.buffer.
# Import from file defined with string path
C.import_from_file('/test/my_file.csv', "users")
# Import from path object
C.import_from_file(pathlib.Path('/test/my_file.csv'), "users")
# Import from opened file
file = open('/test/my_file.csv', 'rb')
C.import_from_file(file, "users")
file.close()
# Import from STDIN
C.import_from_file(sys.stdin.buffer, "users")Export data from Exasol into file, path object or file-like object opened in binary mode. You may export to process STDOUT using sys.stdout.buffer.
# Export from file defined with string path
C.export_to_file('my_file.csv', "users")
# Export into STDOUT
C.export_to_file(sys.stdout.buffer, "users")Please refer to Exasol User Manual to know more about IMPORT / EXPORT parameters.
| Name | Example | Description |
|---|---|---|
column_separator |
, |
Column separator for CSV |
column_delimiter |
" |
Column delimiter for CSV (quotting) |
columns |
['id', 'name'] |
List of table columns in data source, useful if column order of data source does not match column order of Exasol table |
row_separator |
LF |
Row separator for CSV (line-endings) |
encoding |
UTF8 |
File encoding |
with_column_names |
True |
Add column names as first line, useful for Pandas |
null |
\N |
Custom NULL value |
delimit |
AUTO |
Delimiter mode: AUTO, ALWAYS, NONE |
format |
gz |
Import file or stream compressed with gz, bzip2, zip |
| Name | Example | Description |
|---|---|---|
column_separator |
, |
Column separator for CSV |
column_delimiter |
" |
Column delimiter for CSV (quotting) |
columns |
['id', 'name'] |
List of table columns, useful to reorder table columns during export from table |
row_separator |
LF |
Row separator for CSV (line-endings) |
encoding |
UTF8 |
File encoding |
skip |
1 |
How many first rows to skip, useful for skipping header |
null |
\N |
Custom NULL value |
trim |
TRIM |
Trim mode: TRIM, RTRIM, LTRIM |
format |
gz |
Export file or stream compressed with gz, bzip2, zip |
Full collection of pre-defined callback functions is available in callback.py module.
Example of callback exporting into basic Python list.
# Define callback function
def export_to_list(pipe, dst, **kwargs):
wrapped_pipe = io.TextIOWrapper(pipe, newline='\n')
reader = csv.reader(wrapped_pipe, lineterminator='\n', **kwargs)
return [row for row in reader]
# Run EXPORT using defined callback function
C.export_to_callback(export_to_list, None, 'my_table')Example of callback importing from Pandas into Exasol.
df = <pandas.DataFrame>
def import_from_pandas(pipe, src, **kwargs):
wrapped_pipe = io.TextIOWrapper(pipe, newline='\n')
return src.to_csv(wrapped_pipe, header=False, index=False, quoting=csv.QUOTE_NONNUMERIC, **kwargs)
# Run IMPORT using defined callback function
C.export_from_callback(import_from_pandas, df, 'my_table')