RxSpark
revoscalepy.RxSpark(hdfs_share_dir: str = '/user/RevoShare/766RR78ROCWFDMK$',
share_dir: str = '/var/RevoShare/766RR78ROCWFDMK$',
user: str = '766RR78ROCWFDMK$', name_node: str = None,
master: str = 'yarn', port: int = None,
auto_cleanup: bool = True, console_output: bool = False,
packages_to_load: list = None, idle_timeout: int = 3600,
num_executors: int = None, executor_cores: int = None,
executor_mem: str = None, driver_mem: str = None,
executor_overhead_mem: str = None, extra_spark_config: str = '',
spark_reduce_method: str = 'auto', tmp_fs_work_dir: str = None,
persistent_run: bool = False, wait: bool = True, **kwargs)
Description
Creates a compute context for running revoscalepy analyses on a Spark cluster. Note that rx_spark_connect() is recommended over RxSpark() because rx_spark_connect() supports a persistent session with in-memory caching. Run help(revoscalepy.rx_spark_connect) for more information.
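Example
A minimal sketch of creating and activating a Spark compute context. It assumes a YARN-managed Spark cluster reachable from the current node; the executor and memory settings shown are illustrative assumptions, not library defaults.

    from revoscalepy import RxSpark, rx_set_compute_context

    # Build the compute context; resource sizes here are example assumptions.
    spark_cc = RxSpark(
        num_executors=2,        # number of Spark executors to request
        executor_cores=2,       # cores per executor
        executor_mem="4g",      # memory per executor
        driver_mem="2g",        # memory for the Spark driver
        idle_timeout=3600,      # seconds of inactivity before shutdown
        console_output=True)    # echo Spark job output to the Python console

    # Route subsequent revoscalepy analysis functions to the cluster.
    rx_set_compute_context(spark_cc)

For a persistent session with in-memory caching, rx_spark_connect() accepts largely the same arguments and is paired with rx_spark_disconnect() when the work is finished.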