Components
The platform is split into components to keep it modular, to scale easily in cloud-based deployments, and to handle high loads without major changes. Each component can be installed on its own machine, or any subset of components can share a machine.
The components are:
- Web Application Server: The web application server for the analytical UI of the platform
- API - Celery worker: The worker for asynchronous API tasks
- Spark - Celery worker: The worker for asynchronous Spark tasks
- Jupyter Notebook: The Jupyter Notebook server for free-form analytical use
- File Management: The file management server to manage files
- Metadata Database (SQL RDBMS): The database that holds all metadata provided in the web application
- Messaging Queue (Redis): The messaging queue to orchestrate worker tasks
- Authentication Provider: The identity and auth provider for access and permissions
- Proxy / Load Balancers: Proxies or load balancers that simplify installation
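
The placement rule described above (each component on its own machine, or any subset sharing one) can be sketched as a small check. This is only an illustration, not part of the platform: the component identifiers, the host names, and the `validate_placement` helper are all hypothetical, and the sketch assumes one host per component, ignoring replicated workers.

```python
from collections import defaultdict

# Hypothetical identifiers for the components listed above.
COMPONENTS = [
    "web-application-server",
    "api-celery-worker",
    "spark-celery-worker",
    "jupyter-notebook",
    "file-management",
    "metadata-database",
    "messaging-queue",
    "authentication-provider",
    "proxy-load-balancer",
]


def validate_placement(placement):
    """Check a component -> host mapping and return it grouped by host.

    Raises ValueError if a listed component has no host or if the
    mapping names a component that is not in COMPONENTS.
    """
    missing = [c for c in COMPONENTS if c not in placement]
    unknown = [c for c in placement if c not in COMPONENTS]
    if missing or unknown:
        raise ValueError(f"missing={missing}, unknown={unknown}")

    by_host = defaultdict(list)
    for component in COMPONENTS:
        by_host[placement[component]].append(component)
    return dict(by_host)


# Layout 1: every component on the same (hypothetical) machine.
single_host = {c: "host-a" for c in COMPONENTS}

# Layout 2: the Spark worker and the metadata database moved to their own machines.
split = {
    **single_host,
    "spark-celery-worker": "host-b",
    "metadata-database": "host-c",
}

if __name__ == "__main__":
    for host, components in validate_placement(split).items():
        print(host, components)
```

Both layouts pass the same check, which reflects the point that moving a component to another machine changes only its placement, not the set of components.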