Geneva provides a comprehensive job execution framework for distributed feature engineering workflows. This section covers the different types of jobs and execution contexts available in Geneva.
Job Types
Backfilling
Trigger distributed jobs to populate column values in your LanceDB table using UDFs. Learn about filtered backfills and incremental updates.
Materialized Views
Create declarative materialized views to manage batch updates of expensive operations. Optimize data layouts for training and simplify orchestration.
Startup Optimizations
Optimize job and session startup times for faster interactive development and production workflows. Learn about caching, pre-warming, and performance tuning.
Execution Contexts
Understand how Geneva automatically packages and deploys your Python execution environment to worker nodes for distributed execution using Ray.
Key Features
- Distributed Processing: Scale feature computation across multiple nodes
- Checkpointing: Resume jobs from failures without losing progress
- Incremental Updates: Process only new or modified data
- Multiple Backends: Support for Ray on Kubernetes and standalone clusters
- Environment Management: Automatic dependency packaging and deployment
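To make the checkpointing and incremental-update ideas above concrete, here is a minimal, library-free sketch in plain Python. It does not use Geneva's API: the `backfill` function, the in-memory `checkpoint` dict, and the `make_flaky_area` helper are all hypothetical names invented for illustration, and Geneva's actual mechanics may differ. The sketch only shows the general pattern: skip rows that already have a value, commit progress batch by batch, and reuse committed results after a failure.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def backfill(
    rows: List[Row],
    column: str,
    udf: Callable[[Row], Any],
    checkpoint: Dict[int, Any],
    batch_size: int = 2,
) -> int:
    """Fill `column` for rows that lack it; return how many UDF calls succeeded.

    Hypothetical helper, not part of Geneva. `checkpoint` maps row id -> value
    and stands in for durable storage of completed batches.
    """
    # Incremental update: only rows without a value are considered.
    pending = [r for r in rows if r.get(column) is None]
    computed = 0
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        results: Dict[int, Any] = {}
        for row in batch:
            if row["id"] in checkpoint:
                # Checkpoint hit: reuse the result from an earlier attempt.
                results[row["id"]] = checkpoint[row["id"]]
            else:
                results[row["id"]] = udf(row)
                computed += 1
        # Commit the whole batch; a failure above leaves it uncommitted.
        checkpoint.update(results)
    # Write values back only after every batch has been committed.
    for row in pending:
        row[column] = checkpoint[row["id"]]
    return computed


def make_flaky_area(fail_on: int) -> Callable[[Row], int]:
    """Build a UDF that fails once on the given row id (simulated failure)."""
    state = {"failed": False}

    def area(row: Row) -> int:
        if row["id"] == fail_on and not state["failed"]:
            state["failed"] = True
            raise RuntimeError("simulated worker failure")
        return row["x"] * row["y"]

    return area


rows: List[Row] = [{"id": i, "x": i, "y": i + 1} for i in range(5)]
checkpoint: Dict[int, Any] = {}
area = make_flaky_area(fail_on=3)

# First attempt: the batch [0, 1] is committed, then the UDF fails on id 3.
try:
    backfill(rows, "area", area, checkpoint)
    first_run_failed = False
except RuntimeError:
    first_run_failed = True

# Retry: ids 0 and 1 come from the checkpoint; only ids 2, 3, 4 are computed.
recomputed = backfill(rows, "area", area, checkpoint)

# Incremental run: a new row arrives, and only that row is processed.
rows.append({"id": 5, "x": 5, "y": 6})
new_only = backfill(rows, "area", area, checkpoint)
```

Committing per batch rather than per row is a deliberate trade-off in this sketch: a failure costs at most one batch of repeated work, while the checkpoint is written far less often.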
Getting Started
- Choose your execution context based on your infrastructure
- Define your UDFs for feature computation
- Trigger backfill operations to populate your data
- Monitor performance and optimize based on usage patterns
For detailed information about each job type and execution context, explore the documentation in this section.