Asset Management Knowledge Graph

A full-stack knowledge graph application that models the relationships between investment portfolios, their holdings, benchmarks, sectors, and ESG ratings — then lets you explore them visually and query them in natural language.

Traditional portfolio analytics lives in spreadsheets and relational databases, where answering “which portfolios share high-risk ESG assets?” requires joins across five or more tables. In a knowledge graph the same question is a traversal: you follow the edges from portfolios to assets to ESG ratings. This project builds that graph from real ETF data and wraps it in an interactive frontend.

Graph Schema

(Portfolio)-[:HOLDS]->(Asset)-[:BELONGS_TO]->(Sector)
(Portfolio)-[:TRACKS]->(Benchmark)-[:COMPOSED_OF]->(Asset)
(Asset)-[:HAS_ESG_SCORE]->(ESGRating)
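
To make the schema concrete, the sketch below answers the motivating question (which portfolios share high-risk ESG assets) by following HOLDS and HAS_ESG_SCORE edges over a tiny in-memory graph. The portfolio and asset ids, the scores, the "higher is better" reading of the 0-10 scale, and the 4.0 threshold are all illustrative assumptions, not data or rules taken from the project.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical HOLDS edges: (portfolio, asset, weight).
HOLDS = [
    ("ETF_A", "ASSET_1", 0.05),
    ("ETF_A", "ASSET_2", 0.03),
    ("ETF_B", "ASSET_1", 0.02),
    ("ETF_B", "ASSET_3", 0.04),
    ("ETF_C", "ASSET_3", 0.01),
]

# Hypothetical HAS_ESG_SCORE edges: asset -> score on a 0-10 scale
# (assumed here: a lower score means higher ESG risk).
ESG_SCORE = {"ASSET_1": 2.5, "ASSET_2": 8.0, "ASSET_3": 3.9}


def shared_high_risk(holds, esg, threshold=4.0):
    """Map each pair of portfolios to the high-risk assets they both hold."""
    holders = defaultdict(set)
    for portfolio, asset, _weight in holds:
        # Unrated assets are skipped rather than treated as high risk (assumption).
        if asset in esg and esg[asset] < threshold:
            holders[asset].add(portfolio)

    shared = defaultdict(set)
    for asset, portfolios in holders.items():
        for pair in combinations(sorted(portfolios), 2):
            shared[pair].add(asset)
    return dict(shared)


result = shared_high_risk(HOLDS, ESG_SCORE)
```

The traversal is two hops (portfolio to asset, asset to rating), which is the part that would otherwise be expressed as a chain of relational joins.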

How It Works

1. Fetch

Downloads holdings CSVs for 8 iShares UCITS ETFs, loads ESG ratings from Kaggle, and enriches the holdings with Yahoo Finance market data.

2. Transform

Parses the iShares CSV format, normalizes ESG scores to a 0-10 scale, maps countries to ISO codes, and generates sector-based ESG scores to fill coverage gaps.
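
A minimal sketch of two of the transform steps, under stated assumptions: the source ESG scale is taken to be 0-100 and rescaled linearly, and coverage gaps are filled with the mean score of rated assets in the same sector. The project's actual source scale and gap-filling rule are not described here, so `normalize_esg`, `fill_sector_gaps`, and the row layout are hypothetical.

```python
from statistics import mean


def normalize_esg(score, src_min=0.0, src_max=100.0):
    """Linearly rescale a score from [src_min, src_max] to the 0-10 scale."""
    if src_max <= src_min:
        raise ValueError("src_max must be greater than src_min")
    clamped = min(max(score, src_min), src_max)  # guard against out-of-range input
    return round(10.0 * (clamped - src_min) / (src_max - src_min), 2)


def fill_sector_gaps(rows):
    """Fill missing 'esg' values with the mean of rated assets in the same sector.

    Rows whose sector has no rated asset keep esg=None.
    """
    by_sector = {}
    for row in rows:
        if row["esg"] is not None:
            by_sector.setdefault(row["sector"], []).append(row["esg"])

    filled = []
    for row in rows:
        esg = row["esg"]
        if esg is None and row["sector"] in by_sector:
            esg = round(mean(by_sector[row["sector"]]), 2)
        filled.append({**row, "esg": esg})
    return filled
```

Returning new row dictionaries instead of mutating the input keeps the step easy to rerun and to test in isolation.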

3. Validate

Runs quality checks on weight sums, ISIN formats, and duplicates. The pipeline halts if any check fails.
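
The three checks could look roughly like the sketch below. It assumes weights are percentages that should sum to about 100, and it checks only the shape of an ISIN (two letters, nine alphanumerics, one check digit), not the check digit itself. The names `validate_holdings`, `run_checks`, and `ValidationError`, and the 1.0 tolerance, are hypothetical.

```python
import re
from collections import Counter

# Shape of an ISIN: 2-letter country code, 9 alphanumerics, 1 numeric check digit.
ISIN_PATTERN = re.compile(r"^[A-Z]{2}[A-Z0-9]{9}[0-9]$")


class ValidationError(Exception):
    """Raised to halt the pipeline when any quality check fails."""


def validate_holdings(rows, weight_tolerance=1.0):
    """Return a list of problem descriptions; an empty list means all checks passed."""
    problems = []

    total = sum(row["weight"] for row in rows)
    if abs(total - 100.0) > weight_tolerance:
        problems.append(f"weights sum to {total:.2f}, expected about 100")

    counts = Counter(row["isin"] for row in rows)
    for isin in sorted(counts):
        if not ISIN_PATTERN.match(isin):
            problems.append(f"malformed ISIN: {isin!r}")
        if counts[isin] > 1:
            problems.append(f"duplicate ISIN: {isin!r} appears {counts[isin]} times")

    return problems


def run_checks(rows):
    """Raise ValidationError if any check fails, so later stages never run."""
    problems = validate_holdings(rows)
    if problems:
        raise ValidationError("; ".join(problems))
```

Collecting every problem before raising means a single failed run reports all issues at once instead of only the first.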

4. Load

Loads the data into Neo4j with batched, idempotent MERGE statements, backed by uniqueness constraints and indexes.
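
One way to get a batched, idempotent load is to send rows in chunks through `UNWIND ... MERGE`, keyed on a property protected by a uniqueness constraint. The sketch below assumes assets are keyed on an `isin` property and takes a generic `run(query, params)` callable so that it stays independent of any driver; the constraint name, property names, and batch size are assumptions, not the project's actual values.

```python
# Neo4j 5 syntax; IF NOT EXISTS makes the statement safe to rerun.
CONSTRAINT = (
    "CREATE CONSTRAINT asset_isin IF NOT EXISTS "
    "FOR (a:Asset) REQUIRE a.isin IS UNIQUE"
)

# MERGE matches an existing node by key or creates it, so reruns do not duplicate.
MERGE_ASSETS = """
UNWIND $rows AS row
MERGE (a:Asset {isin: row.isin})
SET a.name = row.name
"""


def batches(rows, size):
    """Yield consecutive slices of at most `size` rows."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def load_assets(run, rows, batch_size=500):
    """Create the constraint, then MERGE rows batch by batch.

    `run` is any callable taking (query, params). Returns the number of batches sent.
    """
    run(CONSTRAINT, {})
    sent = 0
    for batch in batches(rows, batch_size):
        run(MERGE_ASSETS, {"rows": batch})
        sent += 1
    return sent
```

With the official Neo4j Python driver, `run` could be something like `lambda q, p: session.run(q, p)` inside an open session. Passing a recording function instead makes the batching logic testable without a database.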

What You Can Do

Tech Stack

Backend

Python 3.10+, FastAPI, Neo4j 5, Pydantic v2, Pandas, Anthropic Claude API, yfinance

Frontend

Next.js 16 (App Router), React 19, TypeScript, Tailwind CSS v4, react-force-graph-2d

Infrastructure

Docker Compose (Neo4j), pip-installable package via hatchling

Semantic Standards

OWL Ontology

The graph schema is formally defined as an OWL/RDF ontology in Turtle format. Each node type is declared as an owl:Class, relationships as owl:ObjectProperty with typed domains and ranges, and attributes as owl:DatatypeProperty with XSD types. Available at /ontology.
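
As a rough illustration of what such declarations look like, the sketch below renders a few Turtle statements for the Portfolio-HOLDS-Asset part of the schema. The `am:` prefix and its example.org namespace, the `isin` attribute, and the helper names are hypothetical; the ontology actually served at /ontology may use different IRIs and naming.

```python
PREFIXES = """@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix am: <http://example.org/asset-management#> .
"""  # am: is a placeholder namespace, not the project's real one


def owl_class(name):
    """Declare a node type as an owl:Class."""
    return f"am:{name} a owl:Class ."


def object_property(name, domain, range_):
    """Declare a relationship with a typed domain and range."""
    return (
        f"am:{name} a owl:ObjectProperty ;\n"
        f"    rdfs:domain am:{domain} ;\n"
        f"    rdfs:range am:{range_} ."
    )


def datatype_property(name, domain, xsd_type):
    """Declare an attribute whose value is an XSD literal."""
    return (
        f"am:{name} a owl:DatatypeProperty ;\n"
        f"    rdfs:domain am:{domain} ;\n"
        f"    rdfs:range xsd:{xsd_type} ."
    )


def render_ontology():
    """Assemble a small Turtle document for one relationship of the schema."""
    statements = [
        owl_class("Portfolio"),
        owl_class("Asset"),
        object_property("HOLDS", "Portfolio", "Asset"),
        datatype_property("isin", "Asset", "string"),
    ]
    return PREFIXES + "\n" + "\n\n".join(statements) + "\n"


turtle = render_ontology()
```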

SKOS Vocabularies

Sector and asset class taxonomies follow SKOS (Simple Knowledge Organization System). The GICS sector hierarchy is modelled as a skos:ConceptScheme with preferred and alternative labels, supporting harmonization of variant names across data sources. Available at /ontology.
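
Harmonization with preferred and alternative labels amounts to a lookup from any known variant to the preferred label. The sketch below shows the idea with a plain dictionary: the three preferred labels are real GICS sector names, but the alternative labels and the helper names are illustrative assumptions, not the project's vocabulary.

```python
# Preferred label (skos:prefLabel) -> alternative labels (skos:altLabel).
# The alternative labels here are made-up examples of variant spellings.
SECTOR_SCHEME = {
    "Information Technology": ["IT", "Technology", "Info Tech"],
    "Health Care": ["Healthcare"],
    "Financials": ["Financial Services", "Finance"],
}


def build_lookup(scheme):
    """Index every preferred and alternative label, case-insensitively."""
    lookup = {}
    for pref_label, alt_labels in scheme.items():
        for label in [pref_label, *alt_labels]:
            lookup[label.strip().casefold()] = pref_label
    return lookup


def harmonize(label, lookup):
    """Return the preferred label for a variant, or None if it is unknown."""
    return lookup.get(label.strip().casefold())


LOOKUP = build_lookup(SECTOR_SCHEME)
```

Returning None for unknown labels, rather than guessing, lets the caller decide whether an unmapped sector name should fail validation or be reviewed by hand.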

Data Lineage

Every node and relationship carries provenance metadata: data source tag, ingestion timestamp, and pipeline run ID. This enables traceability from any graph entity back to its origin — whether iShares CSV, Kaggle ESG data, or yfinance enrichment. Query via /ontology.
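
A minimal sketch of how the three provenance fields might be attached to a node's or relationship's properties before loading. The property names (`source`, `ingested_at`, `run_id`), the source tag values, and the helper names are assumptions for illustration.

```python
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Provenance:
    source: str        # data source tag, e.g. "ishares_csv" (hypothetical value)
    ingested_at: str   # ISO 8601 timestamp in UTC
    run_id: str        # identifier shared by everything loaded in one pipeline run


def new_run_id():
    """Generate one id per pipeline run."""
    return uuid.uuid4().hex


def stamp(props, source, run_id, now=None):
    """Return a copy of `props` with provenance fields added."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    return {**props, **asdict(Provenance(source, timestamp, run_id))}
```

Because every entity from one run shares a `run_id`, a single filter on that property is enough to list, audit, or roll back everything the run wrote.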

The Data

The graph covers 8 real iShares UCITS ETFs spanning European, global, ESG-screened, and fixed income strategies: about 2,000 unique assets across 11 sectors, with ESG ratings taken from Kaggle and supplemented by sector-based generation. Relationships are not just binary connections; they carry weights, dates, and scores as properties.