@@ -4,23 +4,34 @@ This project implements an abstraction of objects that can have access to a vari
# Why Use Data-Transport ?

-Data transport is a simple framework that:
-- easy to install & modify (open-source)
-- enables access to multiple database technologies (pandas, SQLAlchemy)
-- enables notebook sharing without exposing database credential.
-- supports pre/post processing specifications (pipeline)
+Data transport is a simple framework for reading from and writing to multiple databases or other technologies that can hold data. With **data-transport**, you can:

+- Enjoy the simplicity of **data-transport**, which builds on SQLAlchemy and pandas data-frames.
+- Share notebooks and code without disclosing database credentials.
+- Seamlessly and consistently access multiple database technologies at no cost.
+- Stop worrying about accidental writes to a database that lead to inconsistent data.
+- Implement consistent pre- and post-processing as a pipeline, i.e., an aggregation of functions.
+- Use open-source software: **data-transport** is released under the MIT License (https://github.com/lnyemba/data-transport).

## Installation

-Within the virtual environment perform the following :
+Within the virtual environment, perform the following. The installation options are:

- pip install git+https://github.com/lnyemba/data-transport.git
+**sql** - included by default: postgresql, mysql, sqlserver, sqlite3+, duckdb

-Options to install components in square brackets
+ pip install data-transport[cloud,nosql,other,all]@git+https://github.com/lnyemba/data-transport.git

- pip install data-transport[nosql,cloud,warehouse,all]@git+https://github.com/lnyemba/data-transport.git
+Optional components are installed by listing them in square brackets. These components are:
+
+**warehouse** - support for Apache Iceberg, Apache Drill
+
+**cloud** - support for nextcloud, s3
+
+**nosql** - support for mongodb, couchdb

+**other** - support for files, rabbitmq, http
+
+ pip install data-transport[nosql,cloud,warehouse,all]@git+https://github.com/lnyemba/data-transport.git

## Additional features

@@ -28,7 +39,6 @@ Options to install components in square brackets
- CLI interface to add to registry, run ETL
- scales and integrates into shared environments like Apache Zeppelin, JupyterHub, SageMaker, ...

-
## Learn More

Sample notebooks are available with code to read/write against mongodb, couchdb, Netezza, PostgreSQL, Google BigQuery, Databricks, Microsoft SQL Server, MySQL ... Visit the [data-transport homepage](https://healthcareio.the-phi.com/data-transport).
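
The README describes pre- and post-processing as a pipeline, i.e., an aggregation of functions. The sketch below illustrates that general idea only and is not data-transport's API: `make_pipeline`, `drop_missing_age`, and `add_is_adult` are hypothetical names, and plain lists of dicts stand in for data-frames to keep the example dependency-free.

```python
from functools import reduce

def make_pipeline(*steps):
    """Compose steps left to right into one callable (hypothetical helper)."""
    def run(rows):
        # Feed the output of each step into the next one.
        return reduce(lambda data, step: step(data), steps, rows)
    return run

def drop_missing_age(rows):
    # Pre-processing step: discard records without an age.
    return [r for r in rows if r.get("age") is not None]

def add_is_adult(rows):
    # Post-processing step: derive a new field from an existing one.
    return [{**r, "is_adult": r["age"] >= 18} for r in rows]

rows = [
    {"name": "a", "age": 30},
    {"name": "b", "age": None},
    {"name": "c", "age": 12},
]

clean = make_pipeline(drop_missing_age, add_is_adult)
result = clean(rows)
print(result)
```

Because every step takes rows and returns rows, the same composed callable can be reused for any source or target, which is one way to keep processing consistent across databases.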