Helix Facts
- What Helix is: Helix is an AI-assisted fund data platform that ingests, normalises, validates and distributes fund data to support operations, regulatory reporting and distribution workflows.
- Inputs supported today: CSV, XLSX, PDF (text-based factsheets and data sheets).
- Outputs supported today: Openfunds v2.1 share class data, FinDatEx EMT 4.2, FinDatEx EPT 2.1. PRIIPs production workflows are supported.
- Governance controls: Validation rules applied at ingestion, field-level confidence signals, transformation audit trail, and role-based access controls.
- Integration posture: API and export-driven. Integration-ready for connection to downstream systems via file exports, structured outputs, or workflow and orchestration layers.
- Commercial posture: Pilot available. No minimum commitment. No onboarding fees. Month-to-month pricing.
What is Helix and who is it for?
Helix is a fund data platform built to give asset managers, fund administrators and operations teams a single, clean, validated view of their fund data across every output they need to produce.
- The mission is to become the single pane of glass for firm data: one trusted data layer that feeds operations, regulatory submissions, distributor templates, marketing and client communications.
- DataHub is the first product in that direction, focused on ingestion, normalisation and distribution of share class and fund data.
- Designed for fund operations teams, compliance officers, data managers and distribution teams at asset managers and fund administrators of any size.
- No minimum fund count and no large implementation project required. You can upload your first file and see mapped outputs within minutes of registering.
- Built to sit alongside existing tools and workflows rather than requiring a full cutover from day one.
What is DataHub?
DataHub is Helix's ingestion and normalisation layer: it takes raw fund data files in the supported formats (CSV, XLSX and text-based PDF), maps the fields to a validated internal schema, and produces clean, standards-compliant outputs ready for distribution or downstream systems.
- Ingest: Accepts CSV, XLSX and PDF (factsheet) files. Multiple files can be uploaded in a single session.
- Normalise: Maps source column headers and values to a consistent internal schema using a rules-first engine with AI assistance for ambiguous or non-standard headers.
- Validate: Applies required-field checks, type checks, cross-field consistency rules, and template-specific rules before export is permitted.
- Export: Produces Openfunds, EMT and EPT outputs in CSV or XLSX format, ready to send to platforms, distributors or internal teams.
- Reduces manual spreadsheet stitching, copy-paste errors and distributor rejections caused by inconsistent field formatting.
- Provides a field-level audit trail so every transformation decision is traceable and reviewable.
What inputs can Helix ingest?
Helix accepts structured and semi-structured fund data files in three formats today.
- CSV: Flat comma-separated files. Column headers are matched to schema fields automatically. Works well for administrator exports and internal data feeds.
- XLSX (Excel): Single or multi-sheet workbooks. Helix extracts the primary data sheet and normalises headers and values in the same way as CSV.
- PDF (text-based factsheets): Helix extracts tables and key identifiers from text-based PDFs. Best results come from clearly labelled fields near visible identifiers such as ISIN and currency.
- Strongly recommended identifiers: ISIN, full share class name, and share class currency. These are required or strongly recommended for most output templates.
- Scanned or image-only PDFs are accepted but produce lower extraction quality. Where possible, use a text-based PDF or re-upload the data as CSV or XLSX.
- Multiple files can be uploaded in a single session, allowing batch ingestion of a full fund range at once.
What outputs can Helix generate?
Helix produces three types of standards-compliant output today, all versioned and aligned to published industry specifications.
- Openfunds v2.1: Share class master data in the Openfunds standard. Covers share class attributes, fees, lifecycle dates, and distribution data. Used for exchange between fund managers, administrators and platforms.
- FinDatEx EMT 4.2 (European MiFID Template): Covers investor profile, target market and cost data required for MiFID II distribution. Required by most European and UK distributors.
- FinDatEx EPT 2.1 (European PRIIPs Template): Covers performance scenarios, PRIIPs cost calculations and transaction cost data. Required for PRIIPs-in-scope products distributed in the EU and UK.
- All exports are available in CSV or XLSX format.
- Outputs are validated before export. A file that fails required-field or consistency checks cannot be exported until the issue is resolved.
- PRIIPs production workflows are supported. Contact us for current scope and availability.
How does mapping work?
Helix uses a rules-first mapping approach, applying deterministic matching before falling back to AI-assisted analysis for ambiguous cases.
- Exact match: Column headers that exactly match schema field names or known aliases are mapped automatically with 100% confidence.
- Pattern match: Common variations, abbreviations and regional naming conventions (e.g. "ccy", "currency", "share_ccy") are matched via a pre-built rule set at 90% confidence.
- AI-assisted match: Where rule-based matching does not resolve a header, a language model analyses the column name and sample values to suggest the most likely schema field. AI-assisted matches are capped at a defined confidence ceiling and flagged for review.
- Exception handling: Unresolved headers are surfaced in the audit trail with a prompt to manually confirm or reassign the mapping before export.
- ISIN and other unique identifiers are given priority weighting in the matching logic to reduce the risk of identifier mis-mapping.
- Manual mapping overrides can be applied in the interface. All overrides are logged in the audit trail.
How do you prevent hallucinations and errors?
Helix is built on a deterministic-first architecture: rules and validation logic are applied before and after any AI-assisted step, and outputs are never produced without passing validation.
- The mapping engine applies deterministic rules first. AI is only invoked for fields that rules cannot resolve.
- AI-assisted matches are assigned a confidence score and flagged for human review. They do not silently override rule-based results.
- All mapped fields pass through a validation layer before export. Required fields, type constraints and cross-field rules must pass before a clean export can be generated.
- The audit trail records every mapping decision, its source (rule-based, AI-assisted, or manual), and its confidence level, giving reviewers a full picture of how each field was resolved.
- Conservative defaults are used throughout: where a value is ambiguous or missing, the system flags it rather than filling it silently.
What validations are applied?
Helix applies layered validation at ingestion, after mapping, and again before each export is generated.
- Required field checks: Each output template has a defined set of required fields. Rows missing required fields are flagged and cannot be exported in that template until resolved.
- Type and format checks: Date fields, currency codes (ISO 4217), numeric fields and ISIN format (12 characters: two-letter country code, nine alphanumeric characters and a check digit) are validated against expected patterns.
- Consistency checks: Where fields should be consistent across a share class (e.g. domicile, fund group), contradictions between rows are surfaced.
- Cross-field rules: Template-specific logic is applied. For example, EMT requires consistent currency across investor profile fields; EPT requires performance identifiers.
- Row-level rejection messages: Every failed row returns a specific field-level reason, not a generic error. This allows targeted correction without re-processing the whole file.
- Validation results are included in the audit trail and visible in the review interface before export.
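A minimal sketch of the row-level checks described above, in Python. The required-field list, the currency subset and the message wording are examples for illustration, not Helix's actual validation rules:

```python
import re

# 12-character ISIN: two-letter country code, nine alphanumeric, check digit.
ISIN_RE = re.compile(r"^[A-Z]{2}[A-Z0-9]{9}[0-9]$")
# Illustrative subset of ISO 4217 currency codes.
KNOWN_CURRENCIES = {"EUR", "GBP", "USD", "CHF", "JPY"}

def validate_row(row: dict, required: list[str]) -> list[str]:
    """Return specific field-level error messages for one share class row."""
    errors = []
    for field_name in required:
        if not row.get(field_name):
            errors.append(f"{field_name}: required field is missing")
    isin = row.get("isin", "")
    if isin and not ISIN_RE.match(isin):
        errors.append("isin: does not match 12-character ISIN format")
    ccy = row.get("share_class_currency", "")
    if ccy and ccy not in KNOWN_CURRENCIES:
        errors.append("share_class_currency: not a recognised ISO 4217 code")
    return errors
```

Returning a message per failed field, rather than one generic error per row, is what allows targeted correction without re-processing the whole file.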
What does the audit trail include?
Every transformation is logged so that reviewers, compliance teams and operations managers can trace how each output was produced from its source.
- Source file reference: The original filename, upload timestamp and the user who uploaded it.
- Mapping decisions: Each field mapping is recorded with the method used (exact match, pattern match, AI-assisted, or manual override) and the confidence score assigned.
- Transformations applied: Any format conversions, type coercions or value normalisations are logged against the field and row they affected.
- Validation outcomes: Pass, fail or warning status per field and per row, with specific reasons for each outcome.
- User actions: Manual overrides, mapping confirmations and export events are attributed to the user who performed them.
- Timestamps: All events in the pipeline carry UTC timestamps for audit and compliance purposes.
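The entry types above could be represented with a structure like the following. The field names here are assumptions based on the events listed, not Helix's actual audit schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AuditEvent:
    """One pipeline event: mapping, transformation, validation or export."""
    event_type: str
    source_file: str   # original filename of the upload
    user: str          # who performed or triggered the action
    detail: dict       # event-specific payload
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def record_mapping(source_file: str, user: str, header: str,
                   schema_field: str, method: str, confidence: float) -> AuditEvent:
    """Log one mapping decision with its method and confidence score."""
    return AuditEvent(
        event_type="mapping",
        source_file=source_file,
        user=user,
        detail={"header": header, "schema_field": schema_field,
                "method": method, "confidence": confidence},
    )
```

Carrying the method ("exact", "pattern", "ai", "manual") and confidence on every event is what lets a reviewer reconstruct exactly how each field was resolved.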
How does Helix fit into our operating model?
Helix is designed to be adopted incrementally, running in parallel with your existing process before replacing it, rather than requiring a full cutover from day one.
- Parallel run: Start by uploading one fund's data and comparing Helix outputs against your existing templates. Validate before expanding scope.
- Incremental expansion: Once validated, onboard additional funds or templates. There is no hard limit on fund count or file volume.
- Role-based access: Different team members can be assigned different roles, for example upload access, review access and export approval.
- Sign-off workflow: Outputs can be reviewed and approved in the interface before being exported, supporting existing sign-off and governance processes.
- Helix does not require you to decommission existing systems as a precondition. It sits alongside your current stack and feeds it clean, validated data.
- For teams running manual EMT or EPT production in spreadsheets, Helix replaces that step without requiring changes to downstream processes or recipient systems.
Integrations and automation
Helix is designed to integrate with your existing technology stack via exports, API endpoints, and workflow and orchestration layers.
- File exports: All outputs are available as CSV or XLSX downloads, compatible with any downstream system that accepts structured files.
- API access: Helix exposes API endpoints for programmatic ingestion and export retrieval. API documentation is available on request.
- Workflow and orchestration layers: Helix is integration-ready for connection to workflow and orchestration tools that manage broader data pipelines and operational processes.
- Downstream systems: Clean, validated outputs are designed for onward delivery to distributor platforms, data warehouses, CRM systems and marketing tools.
- Automated or scheduled ingestion pipelines can be discussed as part of onboarding for teams with recurring data feeds.
- Speak to us about your specific stack and integration requirements: ludovic.milne@helix-ai.co.uk
Security and data handling
Helix processes data securely in transit and at rest, with access controls applied at user and role level.
- All data is transmitted over HTTPS. Data at rest is stored on secure cloud infrastructure.
- Access is controlled at user level. Role-based permissions limit what each user can upload, review or export.
- Uploaded files and generated exports can be deleted from the platform on request.
- Helix does not sell or share your data with third parties.
- For specific security questionnaires, compliance requirements or data processing agreements, contact us directly.
- Support: ludovic.milne@helix-ai.co.uk | Book a call: calendly.com/ludovic-milne-helix-ai
Commercials and onboarding
Helix is designed to be easy to start, easy to expand and easy to exit if it is not right for you.
- Free trial: Register at app.helix-ai.co.uk/register and upload your first file immediately. No credit card required to start.
- No onboarding fees: There is no implementation charge, setup fee or professional services requirement to get started.
- No minimums: No minimum fund count, minimum file volume or minimum contract value.
- Month-to-month: Pricing scales with usage. You are not locked into a long-term commitment.
- Typical pilot flow: (1) Register and upload a sample file. (2) Review mapped outputs. (3) Compare against your current templates. (4) Confirm scope and expand to your full fund range.
- For larger teams or firms with specific configuration requirements, we offer a guided onboarding session. Book via Calendly.
What Helix is not
To set the right expectations, here is what Helix does not do today.
- Not a portfolio management system: Helix does not manage portfolios, holdings or transactions. It ingests and normalises data about funds and share classes.
- Not a performance calculation engine: Helix does not calculate NAV, benchmark-relative returns or risk metrics. It processes and validates data that includes those values where they are provided in source files.
- Not a data vendor: Helix does not supply market data, pricing or reference data. It works with the data you bring to it.
- Not a CMS or document management system: Helix produces structured data outputs. It is not a system of record for documents or factsheets in their original form.
What Helix does instead: it ingests your fund data from any supported format, normalises it to industry standards, validates it against regulatory and template requirements, and distributes it to the teams, systems and recipients that need it.
Frequently asked questions
Does Helix replace our data warehouse?
- No. Helix is an ingestion and normalisation layer that sits upstream of your data warehouse.
- It is designed to feed clean, validated data into your existing downstream systems, not replace them.
- Integration with data warehouses is supported via file exports or API.
Can we run this alongside our current process?
- Yes. The recommended approach is a parallel run: use Helix to produce outputs alongside your existing process, compare them, and expand when you are satisfied.
- Helix does not require any changes to your existing systems or recipient processes.
- Onboarding a single fund or template first is supported and encouraged.
Do you support distributor-specific EMT variations?
- Helix generates EMT 4.2 outputs aligned to the FinDatEx published standard.
- Column reordering, custom naming conventions or additional fields required by specific distributors are available on request.
- Contact us with your distributor's template requirements.
What if my PDF is scanned?
- Scanned PDFs (image-only files) produce lower extraction quality than text-based PDFs.
- For scanned files, we recommend re-supplying the data as CSV or XLSX for reliable results.
- Text-based PDFs with clearly labelled fields and visible identifiers (ISIN, currency) produce the best extraction results.
Can Helix output data to our CRM or marketing tools?
- Helix exports clean, structured data that is integration-ready for connection to CRM, marketing automation and content platforms.
- Connections to specific tools are supported via file exports, API endpoints, or workflow and orchestration layers.
- Speak to us about your specific downstream use case.
Do you support PRIIPs KID production?
- Helix includes PRIIPs production workflows.
- Contact us for current scope and availability: ludovic.milne@helix-ai.co.uk
How do we handle missing fields?
- Missing required fields are flagged at row level in the audit trail with specific field names and reasons.
- You can correct the source file and re-upload, or set defaults for specific fields where the platform allows.
- Optional fields that are missing are noted in the audit trail but do not block export unless required by the target template.
What is the difference between EMT, EPT and Openfunds?
- EMT (European MiFID Template): investor profile, target market and cost data for distribution. Required by most European and UK distributors.
- EPT (European PRIIPs Template): performance scenarios, PRIIPs cost calculations and transaction cost data for PRIIPs-in-scope products.
- Openfunds v2.1: share class master data standard used for data exchange between fund managers, administrators and platforms.
How do I fix unrecognised headers?
- Rename the column to a recognised alias such as "isin", "share_class_currency", "ccy", or "fund_name".
- Alternatively, the mapping interface will prompt you to manually confirm or reassign the header before processing continues.
- All corrections are logged in the audit trail against the relevant field.
Is there a free trial? What does it include?
- Yes. Register at app.helix-ai.co.uk/register and start immediately with no credit card or contract required.
- The trial gives access to the full DataHub ingestion and export workflow.
- No onboarding fees, no minimum commitment and no automatic charges.
Which fields are mandatory for each template?
- Core static data: ISIN, Full Share Class Name, Share Class Currency.
- EMT 4.2: Adds investor profile and cost fields. Currency consistency is required across investor profile fields.
- EPT 2.1: Adds performance scenario fields and PRIIPs cost data. Identifiers (ISIN) are required.
- Openfunds v2.1: Share class attributes including lifecycle dates and distribution data, following the Openfunds published specification.
Can I upload multiple funds or share classes at once?
- Yes. Multiple files can be uploaded in a single session.
- Each file is processed independently and results are combined in the audit view.
- There is no hard limit on the number of share classes per file or per session.
Does Helix learn from our corrections over time?
- Mapping overrides and confirmed corrections are logged in the audit trail.
- The system is designed to improve matching for recurring patterns in your data over time.
- Contact us for details on how mapping memory and correction workflows operate in your account.
What does a typical onboarding look like?
- Register, upload a representative sample file (one or two share classes), and review the mapped output. This typically takes under 30 minutes.
- Compare the Helix output against your current EMT, EPT or Openfunds file to validate accuracy.
- Expand to your full fund range once satisfied. No implementation work is required from your side.
- For complex configurations or large fund ranges, book a guided session via Calendly.