sift_py.data_import.parquet

CLASS DESCRIPTION
ParquetUploadService

ParquetUploadService

ParquetUploadService(rest_conf: SiftRestConfig)

Bases: _RestService

METHOD DESCRIPTION
flat_dataset_upload

Uploads the Parquet file pointed to by path to the specified asset. This function will automatically generate the Parquet config using the footer.

upload

Uploads the Parquet file pointed to by path using a custom Parquet config.

upload_from_url

Uploads the Parquet file pointed to by url using a custom Parquet config.

ATTRIBUTE DESCRIPTION
DETECT_CONFIG_PATH

UPLOAD_PATH

URL_PATH

DETECT_CONFIG_PATH class-attribute instance-attribute

DETECT_CONFIG_PATH = '/api/v0/data-imports:detect-config'

UPLOAD_PATH class-attribute instance-attribute

UPLOAD_PATH = '/api/v1/data-imports:upload'

URL_PATH class-attribute instance-attribute

URL_PATH = '/api/v1/data-imports:url'
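The three path constants above are relative endpoint paths. As a rough illustration of how they might combine with a REST base URI, the sketch below joins them by hand. The base URI `https://sift.example.com` and the `endpoint` helper are hypothetical and not part of sift_py; how the service itself builds its URLs is not documented here.

```python
# Hypothetical base URI; substitute the URI of your own Sift REST endpoint.
BASE_URI = "https://sift.example.com"

# Path constants copied from the ParquetUploadService attributes above.
DETECT_CONFIG_PATH = "/api/v0/data-imports:detect-config"
UPLOAD_PATH = "/api/v1/data-imports:upload"
URL_PATH = "/api/v1/data-imports:url"


def endpoint(base_uri: str, path: str) -> str:
    """Join a base URI and an endpoint path with exactly one slash between them."""
    return base_uri.rstrip("/") + "/" + path.lstrip("/")


if __name__ == "__main__":
    for name, path in [
        ("detect-config", DETECT_CONFIG_PATH),
        ("upload", UPLOAD_PATH),
        ("url", URL_PATH),
    ]:
        print(name, endpoint(BASE_URI, path))
```

Note that DETECT_CONFIG_PATH sits under `/api/v0` while the two upload paths sit under `/api/v1`, as listed above.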

flat_dataset_upload

flat_dataset_upload(
    asset_name: str,
    path: Union[str, Path],
    time_path: str,
    time_format: TimeFormatType = ABSOLUTE_UNIX_NANOSECONDS,
    complex_types_import_mode: ParquetComplexTypesImportModeType = BOTH,
    run_name: Optional[str] = None,
    run_id: Optional[str] = None,
    relative_start_time: Optional[str] = None,
) -> DataImportService

Uploads the Parquet file pointed to by path to the specified asset. This function automatically generates the Parquet config from the file's footer. The parameters described below can be overridden. Use upload if you need to specify a custom Parquet config.

Set time_path to specify which column contains timestamp information, and time_format to specify the format of the time data. The default time_format is TimeFormatType.ABSOLUTE_UNIX_NANOSECONDS.

Override complex_types_import_mode to specify how complex types (maps and lists) are imported. The default is both strings and bytes. Override run_name to specify the name of the run to create for this data. The default is None. Override run_id to specify the ID of the run to add this data to. The default is None. Override relative_start_time if a relative time format is used. The default is None.
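A minimal sketch of calling flat_dataset_upload through a thin wrapper. The wrapper name `upload_flat_parquet` and the local file check are illustrative assumptions, not part of sift_py. The service is passed in as an argument so the snippet can be exercised without a network connection.

```python
from pathlib import Path
from typing import Any, Optional, Union


def upload_flat_parquet(
    service: Any,
    asset_name: str,
    path: Union[str, Path],
    time_path: str,
    run_name: Optional[str] = None,
) -> Any:
    """Call flat_dataset_upload after checking that the file exists locally."""
    parquet_path = Path(path)
    if not parquet_path.is_file():
        raise FileNotFoundError(f"No Parquet file at {parquet_path}")
    # time_format and complex_types_import_mode are left at their documented
    # defaults (ABSOLUTE_UNIX_NANOSECONDS and BOTH).
    return service.flat_dataset_upload(
        asset_name=asset_name,
        path=parquet_path,
        time_path=time_path,
        run_name=run_name,
    )
```

In real use, `service` would be a `ParquetUploadService(rest_conf)` built from a SiftRestConfig, as in the constructor signature above. A call might then look like `upload_flat_parquet(service, "demo_asset", "data.parquet", "timestamp")`, where the asset name, file name, and column name are placeholders.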

upload

upload(
    path: Union[str, Path],
    parquet_config: ParquetConfig,
    show_progress: bool = True,
) -> DataImportService

Uploads the Parquet file pointed to by path using a custom Parquet config.

PARAMETER DESCRIPTION
path

The path to the Parquet file.

TYPE: Union[str, Path]

parquet_config

The Parquet config.

TYPE: ParquetConfig

show_progress

Whether to show the status bar during the upload.

TYPE: bool DEFAULT: True
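As a sketch of how show_progress might be used, the wrapper below enables the status bar only when standard output is an interactive terminal, which keeps batch logs clean. The wrapper name `upload_with_config` is hypothetical. The ParquetConfig is treated as an opaque object because its fields are not described on this page.

```python
import sys
from pathlib import Path
from typing import Any, Optional, Union


def upload_with_config(
    service: Any,
    path: Union[str, Path],
    parquet_config: Any,
    show_progress: Optional[bool] = None,
) -> Any:
    """Call upload; if show_progress is None, enable it only on a TTY."""
    if show_progress is None:
        show_progress = sys.stdout.isatty()
    return service.upload(path, parquet_config, show_progress=show_progress)
```

Passing `show_progress=True` or `show_progress=False` explicitly bypasses the terminal check.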

upload_from_url

upload_from_url(
    url: str, parquet_config: ParquetConfig
) -> DataImportService

Uploads the Parquet file pointed to by url using a custom Parquet config.
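A minimal sketch of calling upload_from_url with a basic sanity check on the URL first. The wrapper name `upload_parquet_from_url` is hypothetical, and the scheme-and-host check is an illustrative precaution. Which URL schemes the service accepts is not stated on this page.

```python
from typing import Any
from urllib.parse import urlparse


def upload_parquet_from_url(service: Any, url: str, parquet_config: Any) -> Any:
    """Call upload_from_url after checking the URL has a scheme and a host."""
    parsed = urlparse(url)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"Expected an absolute URL, got {url!r}")
    return service.upload_from_url(url, parquet_config)
```

A bare file name such as `data.parquet` is rejected locally, before any request is made.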