The initial attempt to build our embeddable RAG (Retrieval-Augmented Generation) engine within a Java desktop application relied on a single SQLite database. The thinking was straightforward: keep the deployment simple, a single file on disk. We stored document metadata in standard relational tables, the raw text chunks in a TEXT column, and tried to manage vector embeddings in a BLOB column. This approach failed catastrophically as soon as we loaded the first few thousand documents. Relational queries on metadata became sluggish due to page bloat from the large text and blob fields, and any kind of vector similarity search required loading the entire vector set into memory, which was a non-starter. A fundamental architectural rethink was necessary: moving away from a one-size-fits-all database to a polyglot persistence model tailored to the specific access patterns of a RAG workload.
Our revised architecture identifies three distinct types of data, each with its own optimal storage strategy:
- Structured Metadata & Application State: Document sources, ingestion timestamps, user query history, and processing flags. This is classic relational data, requiring transactional integrity and complex querying capabilities. SQLite managed via JPA/Hibernate remains the best-in-class choice for this in an embeddable context.
- Vector Embeddings: High-dimensional floating-point vectors requiring specialized indexing (like HNSW) for fast approximate nearest neighbor search. A dedicated vector database is non-negotiable. We selected ChromaDB for its simplicity and ability to be run as a local, self-contained server, aligning with our “embeddable” ethos.
- Raw Document Chunks: The actual text content corresponding to each vector. After a vector search identifies relevant document IDs, we need to retrieve this text with minimal latency. Storing it in SQLite was inefficient. A simple, high-throughput key-value store is the perfect tool for this job. LevelDB, with its Log-Structured Merge-Tree design, offers excellent write performance and fast key-based reads, making it ideal for this retrieval pattern.
This tiered approach forms a hybrid storage engine. The central challenge becomes orchestrating these three disparate systems to maintain data consistency and provide a unified data access layer for the application.
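Before walking through each tier, it is worth pinning down the surface the rest of the application sees. The interface below is only an illustrative sketch (the method names mirror the concrete HybridRAGDataService developed later in this section, and QueryResult is defined there); it simply makes explicit that callers never touch SQLite, LevelDB, or ChromaDB directly.
package com.example.rag.service;

import java.util.List;

// Illustrative facade over the three stores. The concrete implementation,
// HybridRAGDataService, is developed step by step below.
public interface RagDataAccess {

    // Writes metadata (SQLite), raw content (LevelDB) and the embedding (ChromaDB)
    // as one logical operation, compensating on partial failure.
    DocumentMetadata addDocument(String sourceId, int chunkIndex, String content);

    // Vector search in ChromaDB, then hydration of raw text (LevelDB)
    // and metadata (SQLite) for the matched document IDs.
    List<QueryResult> search(String query, int topK);
}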
Orchestrating the Polyglot Persistence Layer
The core of the implementation is a HybridRAGDataService that encapsulates all interactions with the three underlying data stores. Its primary responsibility is to ensure that an operation, like adding a new document, is executed correctly across all tiers. In a real-world project, the lack of distributed transaction support across JPA, LevelDB, and a network call to ChromaDB is the most significant risk. Our strategy is to define a primary store (SQLite/JPA) and treat the others as secondary. If a write to a secondary store fails, we must implement compensatory logic to roll back the primary store’s change.
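The shape of that compensation logic can be captured in a small, generic helper. The class below is only a sketch of the pattern, not part of the service implementation shown later (which inlines the same idea): each successful forward step registers an undo action, and a failure unwinds the stack in reverse order.
package com.example.rag.service;

import java.util.ArrayDeque;
import java.util.Deque;

// Sketch of the compensation pattern: run steps in order, remember how to
// undo each one, and unwind on failure. Undo failures are swallowed here;
// a real system would log them and flag the record for later cleanup.
public final class CompensatingSteps {

    private final Deque<Runnable> compensations = new ArrayDeque<>();

    // Execute a forward step; if it succeeds, remember its compensation.
    public void run(Runnable step, Runnable compensation) {
        step.run();
        compensations.push(compensation);
    }

    // Undo completed steps in reverse order (last-in, first-out).
    public void rollback() {
        while (!compensations.isEmpty()) {
            Runnable undo = compensations.pop();
            try {
                undo.run();
            } catch (RuntimeException ignored) {
                // Swallow so the remaining compensations still run.
            }
        }
    }
}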
Here is the ingestion flow for a new document:
sequenceDiagram
    participant App as Application
    participant Service as HybridRAGDataService
    participant JPA as JPA (SQLite)
    participant LevelDB as RawContentStore
    participant Chroma as VectorStore (ChromaDB)

    App->>Service: addDocument(content, metadata)
    Service->>JPA: 1. Persist DocumentMetadata
    JPA-->>Service: Return generated ID (e.g., 42)
    Service->>Service: 2. Generate embedding for content
    alt Successful Ingestion
        Service->>LevelDB: 3. Put(key=42, value=content)
        LevelDB-->>Service: OK
        Service->>Chroma: 4. Add(vector, metadata={"doc_id": "42"})
        Chroma-->>Service: OK
        Service-->>App: DocumentAddedSuccessfully
    else LevelDB Write Fails
        Service->>LevelDB: 3. Put(key=42, value=content)
        LevelDB-->>Service: FAILED
        Service->>JPA: 5a. Rollback: Delete DocumentMetadata(id=42)
        JPA-->>Service: OK
        Service-->>App: IngestionFailedException
    else ChromaDB Add Fails
        Service->>Chroma: 4. Add(vector, metadata={"doc_id": "42"})
        Chroma-->>Service: FAILED
        Service->>JPA: 5b. Rollback: Delete DocumentMetadata(id=42)
        JPA-->>Service: OK
        Service->>LevelDB: 5c. Rollback: Delete(key=42)
        LevelDB-->>Service: OK
        Service-->>App: IngestionFailedException
    end
The corresponding query flow leverages each store for what it does best:
sequenceDiagram
    participant App as Application
    participant Service as HybridRAGDataService
    participant Chroma as VectorStore (ChromaDB)
    participant LevelDB as RawContentStore
    participant JPA as JPA (SQLite)

    App->>Service: retrieveContext(queryText)
    Service->>Service: 1. Generate embedding for queryText
    Service->>Chroma: 2. Query for top-k nearest vectors
    Chroma-->>Service: Return list of doc_ids ["7", "101", "34"]
    Service->>LevelDB: 3. Batch Get(keys=["7", "101", "34"])
    LevelDB-->>Service: Return map of {id -> raw_content}
    Service->>JPA: 4. (Optional) Batch FindByIds(["7", "101", "34"])
    JPA-->>Service: Return map of {id -> metadata}
    Service->>Service: 5. Combine content and metadata
    Service-->>App: Return RichContext objects
Project Foundation and Dependencies
We use Maven to manage the project. The pom.xml must include dependencies for JPA/Hibernate, the SQLite JDBC driver and community dialect, the LevelDB JNI wrapper, and the ChromaDB Java client.
<project ...>
<properties>
<maven.compiler.source>17</maven.compiler.source>
<maven.compiler.target>17</maven.compiler.target>
<hibernate.version>6.2.7.Final</hibernate.version>
<sqlite.driver.version>3.42.0.0</sqlite.driver.version>
</properties>
<dependencies>
<!-- JPA and Hibernate for SQLite -->
<dependency>
<groupId>org.hibernate.orm</groupId>
<artifactId>hibernate-core</artifactId>
<version>${hibernate.version}</version>
</dependency>
<!-- Supplies org.hibernate.community.dialect.SQLiteDialect referenced in persistence.xml -->
<dependency>
<groupId>org.hibernate.orm</groupId>
<artifactId>hibernate-community-dialects</artifactId>
<version>${hibernate.version}</version>
</dependency>
<dependency>
<groupId>org.xerial</groupId>
<artifactId>sqlite-jdbc</artifactId>
<version>${sqlite.driver.version}</version>
</dependency>
<dependency>
<groupId>org.glassfish.jaxb</groupId>
<artifactId>jaxb-runtime</artifactId>
<version>4.0.3</version>
</dependency>
<!-- LevelDB JNI Wrapper -->
<dependency>
<groupId>org.fusesource.leveldbjni</groupId>
<artifactId>leveldbjni-all</artifactId>
<version>1.8</version>
</dependency>
<!-- ChromaDB Java Client -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-chroma</artifactId>
<version>0.24.0</version>
</dependency>
<!-- A simple embedding model client -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-open-ai</artifactId>
<version>0.24.0</version>
</dependency>
<!-- Logging -->
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-simple</artifactId>
<version>2.0.7</version>
</dependency>
</dependencies>
</project>
A common pitfall in modern Java is that JAXB is no longer bundled with the JDK (it was removed in Java 11), so the jaxb-runtime dependency must be added explicitly for Hibernate to function correctly.
Tier 1: Relational Metadata with JPA and SQLite
The first layer is the "source of truth" for document existence. We define a DocumentMetadata entity. Its primary key will serve as the unique identifier across all three storage systems.
package com.example.rag.storage;
import jakarta.persistence.*;
import java.time.Instant;
@Entity
@Table(name = "document_metadata")
public class DocumentMetadata {
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Long id;
@Column(name = "source_id", nullable = false, unique = true)
private String sourceId;
@Column(name = "chunk_index", nullable = false)
private int chunkIndex;
@Column(name = "char_count", nullable = false)
private int charCount;
@Column(name = "ingested_at", nullable = false)
private Instant ingestedAt;
// Standard getters and setters
public Long getId() { return id; }
public void setId(Long id) { this.id = id; }
public String getSourceId() { return sourceId; }
public void setSourceId(String sourceId) { this.sourceId = sourceId; }
public int getChunkIndex() { return chunkIndex; }
public void setChunkIndex(int chunkIndex) { this.chunkIndex = chunkIndex; }
public int getCharCount() { return charCount; }
public void setCharCount(int charCount) { this.charCount = charCount; }
public Instant getIngestedAt() { return ingestedAt; }
public void setIngestedAt(Instant ingestedAt) { this.ingestedAt = ingestedAt; }
@PrePersist
public void onPrePersist() {
this.ingestedAt = Instant.now();
}
}
The JPA configuration is handled via persistence.xml placed in src/main/resources/META-INF. We configure it to connect to a local file-based SQLite database and automatically create/update the schema.
<persistence xmlns="https://jakarta.ee/xml/ns/persistence"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="https://jakarta.ee/xml/ns/persistence https://jakarta.ee/xml/ns/persistence/persistence_3_0.xsd"
version="3.0">
<persistence-unit name="rag-pu" transaction-type="RESOURCE_LOCAL">
<provider>org.hibernate.jpa.HibernatePersistenceProvider</provider>
<class>com.example.rag.storage.DocumentMetadata</class>
<properties>
<property name="jakarta.persistence.jdbc.driver" value="org.sqlite.JDBC"/>
<property name="jakarta.persistence.jdbc.url" value="jdbc:sqlite:rag_metadata.db"/>
<property name="jakarta.persistence.jdbc.user" value=""/>
<property name="jakarta.persistence.jdbc.password" value=""/>
<property name="hibernate.dialect" value="org.hibernate.community.dialect.SQLiteDialect"/>
<property name="hibernate.hbm2ddl.auto" value="update"/>
<property name="hibernate.show_sql" value="false"/>
<property name="hibernate.format_sql" value="true"/>
</properties>
</persistence-unit>
</persistence>
The repository class for this entity handles the transaction boilerplate.
package com.example.rag.storage;
import jakarta.persistence.EntityManager;
import jakarta.persistence.EntityManagerFactory;
import jakarta.persistence.EntityTransaction;
import java.util.function.Function;
public class MetadataRepository {
private final EntityManagerFactory emf;
public MetadataRepository(EntityManagerFactory emf) {
this.emf = emf;
}
public <T> T executeInTransaction(Function<EntityManager, T> action) {
EntityManager em = emf.createEntityManager();
EntityTransaction tx = em.getTransaction();
try {
tx.begin();
T result = action.apply(em);
tx.commit();
return result;
} catch (RuntimeException e) {
if (tx != null && tx.isActive()) {
tx.rollback();
}
throw e;
} finally {
em.close();
}
}
public void executeInTransaction(java.util.function.Consumer<EntityManager> action) {
executeInTransaction(em -> {
action.accept(em);
return null;
});
}
}
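The query flow above also calls for an optional batch metadata lookup (step 4), which is not part of the repository as written. A minimal sketch of such a method, assuming it is added to MetadataRepository, could look like this:
// Hypothetical addition to MetadataRepository: batch lookup of metadata rows
// for the document IDs returned by the vector search (step 4 of the query flow).
public java.util.List<DocumentMetadata> findByIds(java.util.List<Long> ids) {
    return executeInTransaction(em ->
        em.createQuery(
                "SELECT d FROM DocumentMetadata d WHERE d.id IN :ids",
                DocumentMetadata.class)
          .setParameter("ids", ids)
          .getResultList());
}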
Tier 2: High-Speed Raw Text Access with LevelDB
The second layer is a RawContentStore that wraps LevelDB. Its interface is minimal: put, get, delete. Lifecycle management (open, close) is critical to prevent database corruption.
package com.example.rag.storage;
import org.iq80.leveldb.DB;
import org.iq80.leveldb.DBException;
import org.iq80.leveldb.Options;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;
import static org.fusesource.leveldbjni.JniDBFactory.factory;
public class RawContentStore implements AutoCloseable {
private static final Logger logger = LoggerFactory.getLogger(RawContentStore.class);
private final DB levelDb;
public RawContentStore(String path) {
Options options = new Options();
options.createIfMissing(true);
try {
this.levelDb = factory.open(new File(path), options);
logger.info("LevelDB raw content store opened at {}", path);
} catch (IOException e) {
throw new RuntimeException("Failed to open LevelDB at " + path, e);
}
}
public void put(Long key, String value) {
if (key == null || value == null) {
throw new IllegalArgumentException("Key and value cannot be null");
}
try {
levelDb.put(
String.valueOf(key).getBytes(StandardCharsets.UTF_8),
value.getBytes(StandardCharsets.UTF_8)
);
} catch (DBException e) {
logger.error("Failed to write to LevelDB for key: {}", key, e);
throw new StorageException("LevelDB put operation failed", e);
}
}
public Optional<String> get(Long key) {
if (key == null) {
return Optional.empty();
}
try {
byte[] valueBytes = levelDb.get(String.valueOf(key).getBytes(StandardCharsets.UTF_8));
return (valueBytes != null)
? Optional.of(new String(valueBytes, StandardCharsets.UTF_8))
: Optional.empty();
} catch (DBException e) {
logger.error("Failed to read from LevelDB for key: {}", key, e);
throw new StorageException("LevelDB get operation failed", e);
}
}
public void delete(Long key) {
if (key == null) return;
try {
levelDb.delete(String.valueOf(key).getBytes(StandardCharsets.UTF_8));
} catch (DBException e) {
logger.error("Failed to delete from LevelDB for key: {}", key, e);
// In a production system, you might schedule a retry instead of throwing
throw new StorageException("LevelDB delete operation failed", e);
}
}
@Override
public void close() {
try {
levelDb.close();
logger.info("LevelDB raw content store closed.");
} catch (IOException e) {
logger.error("Error closing LevelDB", e);
}
}
// Custom exception for storage layer
public static class StorageException extends RuntimeException {
public StorageException(String message, Throwable cause) {
super(message, cause);
}
}
}
A custom StorageException allows the service layer to catch errors specific to persistence operations. The keys are the Long IDs from the SQLite database, serialized to their decimal string form and stored as UTF-8 byte arrays.
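The query flow's batch retrieval (step 3) can be layered on top of the single-key get; LevelDB has no native multi-get, but point reads against the LSM tree are cheap. A sketch of such a helper, assuming it is added to RawContentStore:
// Hypothetical addition to RawContentStore: fetch several chunks at once for
// the IDs returned by the vector search. Missing keys are simply skipped.
public java.util.Map<Long, String> getAll(java.util.Collection<Long> keys) {
    java.util.Map<Long, String> result = new java.util.LinkedHashMap<>();
    for (Long key : keys) {
        get(key).ifPresent(value -> result.put(key, value));
    }
    return result;
}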
Tier 3: Semantic Search with ChromaDB
The vector store component interfaces with a ChromaDB instance. For an embeddable system, we can launch the Chroma server as a subprocess or connect to a local instance started manually (chroma run --path /path/to/db). The langchain4j-chroma library provides a convenient client.
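Launching the server as a subprocess keeps the deployment self-contained. The snippet below is a rough sketch of that approach, assuming the chroma CLI is installed and on the PATH (Chroma serves on port 8000 by default, matching the baseUrl used later); a production version would also wait for the server to report healthy before connecting.
// Sketch: start a local ChromaDB server as a child process of the Java app.
// Assumes the `chroma` CLI is installed; the data path argument is illustrative.
public static Process startLocalChroma(String dataPath) throws java.io.IOException {
    Process chroma = new ProcessBuilder("chroma", "run", "--path", dataPath)
            .inheritIO() // stream the server's logs to the application console
            .start();
    // Make sure the child process does not outlive the application.
    Runtime.getRuntime().addShutdownHook(new Thread(chroma::destroy));
    return chroma;
}
With the server reachable, the VectorStore wrapper itself is straightforward: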
package com.example.rag.storage;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import dev.langchain4j.store.embedding.EmbeddingStore;
import dev.langchain4j.store.embedding.chroma.ChromaEmbeddingStore;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.List;
public class VectorStore {
private static final Logger logger = LoggerFactory.getLogger(VectorStore.class);
public static final String DOC_ID_METADATA_KEY = "doc_id";
private final EmbeddingStore<TextSegment> embeddingStore;
public VectorStore(String collectionName, String baseUrl, int timeoutSeconds) {
this.embeddingStore = ChromaEmbeddingStore.builder()
.baseUrl(baseUrl)
.collectionName(collectionName)
.timeout(java.time.Duration.ofSeconds(timeoutSeconds))
.build();
logger.info("ChromaDB vector store connected to {} for collection '{}'", baseUrl, collectionName);
}
public String add(Long documentId, Embedding embedding, String originalText) {
TextSegment segment = TextSegment.from(originalText);
// We embed the documentId into the metadata. This is the crucial link back to our other stores.
segment.metadata().put(DOC_ID_METADATA_KEY, String.valueOf(documentId));
try {
return embeddingStore.add(embedding, segment);
} catch (Exception e) {
logger.error("Failed to add embedding to ChromaDB for doc ID: {}", documentId, e);
throw new StorageException("ChromaDB add operation failed", e);
}
}
public List<EmbeddingMatch<TextSegment>> findRelevant(Embedding queryEmbedding, int maxResults) {
try {
return embeddingStore.findRelevant(queryEmbedding, maxResults);
} catch (Exception e) {
logger.error("Failed to query ChromaDB", e);
throw new StorageException("ChromaDB query operation failed", e);
}
}
// Custom exception for storage layer, can be the same as RawContentStore's
public static class StorageException extends RuntimeException {
public StorageException(String message, Throwable cause) {
super(message, cause);
}
}
}
The most critical design decision here is embedding the documentId from SQLite into the ChromaDB metadata. This allows us to retrieve the primary key after a vector search and then use it to fetch the full data from JPA and LevelDB.
The Hybrid Service: Tying It All Together
The HybridRAGDataService orchestrates these three components. Its methods implement the logic from the sequence diagrams, including the crucial compensation logic for handling failures.
package com.example.rag.service;
import com.example.rag.storage.*;
import dev.langchain4j.data.embedding.Embedding;
import dev.langchain4j.data.segment.TextSegment;
import dev.langchain4j.model.embedding.EmbeddingModel;
import dev.langchain4j.store.embedding.EmbeddingMatch;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;
public class HybridRAGDataService {
private static final Logger logger = LoggerFactory.getLogger(HybridRAGDataService.class);
private final MetadataRepository metadataRepo;
private final RawContentStore contentStore;
private final VectorStore vectorStore;
private final EmbeddingModel embeddingModel;
public HybridRAGDataService(
MetadataRepository metadataRepo,
RawContentStore contentStore,
VectorStore vectorStore,
EmbeddingModel embeddingModel) {
this.metadataRepo = metadataRepo;
this.contentStore = contentStore;
this.vectorStore = vectorStore;
this.embeddingModel = embeddingModel;
}
public DocumentMetadata addDocument(String sourceId, int chunkIndex, String content) {
logger.info("Starting ingestion for sourceId: {}, chunkIndex: {}", sourceId, chunkIndex);
// Step 1: Persist metadata to get a generated ID. This is our anchor.
DocumentMetadata metadata = new DocumentMetadata();
metadata.setSourceId(sourceId);
metadata.setChunkIndex(chunkIndex);
metadata.setCharCount(content.length());
DocumentMetadata persistedMetadata = metadataRepo.executeInTransaction(em -> {
em.persist(metadata);
return metadata;
});
Long docId = persistedMetadata.getId();
logger.debug("Persisted metadata for docId: {}", docId);
try {
// Step 2: Store raw content in LevelDB
contentStore.put(docId, content);
logger.debug("Stored raw content in LevelDB for docId: {}", docId);
// Step 3: Generate embedding and store in ChromaDB
Embedding embedding = embeddingModel.embed(content).content();
vectorStore.add(docId, embedding, content);
logger.debug("Stored vector in ChromaDB for docId: {}", docId);
return persistedMetadata;
} catch (Exception e) {
logger.error("Ingestion failed for docId: {}. Initiating rollback.", docId, e);
// Compensation logic: Rollback previous steps
// A more robust implementation might use a state machine or a job queue
// to handle retries for transient errors.
try {
contentStore.delete(docId); // Attempt to delete from LevelDB
} catch(Exception cleanupEx) {
logger.error("Rollback failed for LevelDB content on docId: {}. Manual cleanup may be required.", docId, cleanupEx);
}
try {
metadataRepo.executeInTransaction(em -> {
DocumentMetadata toDelete = em.find(DocumentMetadata.class, docId);
if (toDelete != null) {
em.remove(toDelete);
}
});
} catch(Exception cleanupEx) {
logger.error("CRITICAL: Rollback failed for metadata on docId: {}. Database is in an inconsistent state.", docId, cleanupEx);
}
throw new IngestionException("Failed to ingest document, rollback initiated.", e);
}
}
public List<QueryResult> search(String query, int topK) {
Embedding queryEmbedding = embeddingModel.embed(query).content();
List<EmbeddingMatch<TextSegment>> matches = vectorStore.findRelevant(queryEmbedding, topK);
return matches.stream()
.map(match -> {
String docIdStr = match.embedded().metadata().getString(VectorStore.DOC_ID_METADATA_KEY);
Long docId = Long.parseLong(docIdStr);
// Fetch the raw content from LevelDB. This should be very fast.
Optional<String> content = contentStore.get(docId);
return new QueryResult(
docId,
match.score(),
content.orElse(""),
match.embedded().text() // Chroma can also store the text, but we rely on LevelDB as primary source
);
})
.collect(Collectors.toList());
}
public static class QueryResult {
public final Long documentId;
public final double score;
public final String content;
public final String matchedSegment; // from Chroma
// Constructor, getters, etc.
public QueryResult(Long documentId, double score, String content, String matchedSegment) {
this.documentId = documentId;
this.score = score;
this.content = content;
this.matchedSegment = matchedSegment;
}
}
public static class IngestionException extends RuntimeException {
public IngestionException(String message, Throwable cause) {
super(message, cause);
}
}
}
The error handling in addDocument is the most important part of this service. A failure in any step after the initial database insert triggers a series of compensating actions to clean up the other stores. A production-grade system would need to be even more robust, potentially flagging records for a separate cleanup process if a rollback itself fails, to avoid leaving orphaned data.
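One hedged sketch of such a cleanup pass, assuming it runs offline at startup or on a schedule (the method below is illustrative and not part of the service above): it lists every metadata ID and reports those whose raw content is missing from LevelDB, so they can be re-ingested or purged.
// Illustrative consistency check across the SQLite and LevelDB tiers.
// Orphaned vectors in ChromaDB would need a similar pass on its side.
public static java.util.List<Long> findOrphanedMetadata(
        MetadataRepository metadataRepo, RawContentStore contentStore) {
    java.util.List<Long> allIds = metadataRepo.executeInTransaction(em ->
            em.createQuery("SELECT d.id FROM DocumentMetadata d", Long.class)
              .getResultList());
    return allIds.stream()
            .filter(id -> contentStore.get(id).isEmpty())
            .collect(java.util.stream.Collectors.toList());
}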
Testing the Implementation
A unit or integration test can validate the entire flow. This test would initialize all three storage layers, add a document, and then perform a search to ensure the correct data is retrieved.
// Example usage in a main method or test
public static void main(String[] args) throws IOException {
// Setup - In a real app, this would be managed by a DI framework
EntityManagerFactory emf = Persistence.createEntityManagerFactory("rag-pu");
MetadataRepository metadataRepo = new MetadataRepository(emf);
// Use temporary directories for testing
Path tempDir = Files.createTempDirectory("rag_test");
RawContentStore contentStore = new RawContentStore(tempDir.resolve("leveldb").toString());
VectorStore vectorStore = new VectorStore("test_collection", "http://localhost:8000", 60);
// Mock or real embedding model
EmbeddingModel embeddingModel = OpenAiEmbeddingModel.builder()
.apiKey(System.getenv("OPENAI_API_KEY"))
.modelName("text-embedding-3-small")
.build();
HybridRAGDataService service = new HybridRAGDataService(metadataRepo, contentStore, vectorStore, embeddingModel);
// Run test
try {
String content1 = "The Catcher in the Rye is a novel by J. D. Salinger.";
service.addDocument("salinger_bio_1", 0, content1);
String content2 = "SQLite is a C-language library that implements a small, fast, self-contained, high-reliability, full-featured, SQL database engine.";
service.addDocument("tech_doc_1", 0, content2);
System.out.println("Ingestion successful.");
List<HybridRAGDataService.QueryResult> results = service.search("structured query language", 1);
System.out.println("Search Results:");
results.forEach(r -> {
System.out.printf("ID: %d, Score: %.4f, Content: %s%n", r.documentId, r.score, r.content);
});
} catch (Exception e) {
e.printStackTrace();
} finally {
// Cleanup
contentStore.close();
emf.close();
}
}
Before running this, one must have a ChromaDB server running locally (chroma run). The test demonstrates the successful orchestration: data is written across three systems, and a query correctly identifies the relevant document ID from the vector store and retrieves the corresponding raw content from the key-value store.
The primary limitation of this architecture is its explicit design for single-node, embeddable deployments. The manual compensation logic for transaction failures introduces complexity and potential for inconsistency if a rollback action fails. It cannot scale horizontally without significant redesign; for a distributed system, one would likely converge on a managed database solution like PostgreSQL with pgvector, or fully distributed services for each data type. Furthermore, the lifecycle management of the ChromaDB process needs to be robustly handled by the parent Java application, including startup, shutdown, and monitoring, which adds operational overhead. This architecture, however, represents a highly optimized solution for its specific niche: creating powerful, self-contained AI applications that do not require external database infrastructure.