International collaboration with combined AI and legal expertise
Noxtua Voyage Embed outperforms OpenAI’s most powerful search model on average by a factor of 2 on legal text benchmarks
Noxtua Voyage Embed can be used to build custom AI search solutions
BERLIN/STANFORD, September 20, 2024 – Oxford legal AI spin-off Xayn, Stanford spin-off Voyage AI, and the German legal database dejure.org jointly launch Noxtua Voyage Embed, a fine-tuned legal search embedding model specialized in European and German law. The search model is trained on a dataset of around 20 billion tokens of legal texts from dejure.org and from Noxtua, Europe's first sovereign Legal AI, which Xayn developed with legal expertise from the international law firm CMS. The newly released model supports a context length of up to 32k tokens, which makes it well suited to legal texts, as these tend to be lengthy.
To validate its quality, Noxtua Voyage Embed (voyage-law-2-xayn) was tested on several legal text benchmarks. In search accuracy, it scored more than 25 points higher (1.7 times better) than OpenAI's most powerful search model (text-embedding-3-large); in ranking quality, it performed 2.2 times better. It achieved these results with a 3x smaller dimensionality, which means 3x lower storage and latency in vector-based search.
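To make the storage claim concrete, the following back-of-the-envelope sketch estimates the raw size of a float32 vector index. The dimensionalities (3072 and 1024) and the corpus size of 2 million documents are assumptions chosen for illustration; they are not taken from the benchmark results above, and real indexes add overhead that is ignored here.

```python
BYTES_PER_FLOAT32 = 4


def index_size_bytes(num_vectors: int, dimensions: int) -> int:
    """Raw size of a dense float32 vector index, ignoring index overhead."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32


# Assumed values, for illustration only.
NUM_DOCUMENTS = 2_000_000
LARGE_DIMS = 3072  # assumed dimensionality of text-embedding-3-large
SMALL_DIMS = 1024  # assumed dimensionality of voyage-law-2-xayn

large = index_size_bytes(NUM_DOCUMENTS, LARGE_DIMS)
small = index_size_bytes(NUM_DOCUMENTS, SMALL_DIMS)
ratio = large / small

print(f"{large / 1e9:.1f} GB vs {small / 1e9:.1f} GB, ratio {ratio:.0f}x")
# prints "24.6 GB vs 8.2 GB, ratio 3x"
```

Under these assumptions, the per-query cost of a brute-force similarity scan shrinks by the same factor, since each comparison touches a third as many numbers.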
This international partnership combines deep AI expertise based on the latest AI research with in-depth legal specialization:
Voyage AI is a leading developer of customized embedding models and LLM retrieval/search infrastructure and has already gained extensive experience in building custom legal embeddings through its work with Harvey.AI, a US legal AI platform.
dejure.org is one of the most used legal services in Germany with around 15 million hits per month and a database with more than 2 million court decisions and the most relevant laws in practice. For over twenty years, dejure.org has stood for the consistent combination of legal expertise with the integration and networking of legal information and technology.
Berlin-based Xayn offers the perfect combination of deep AI and legal insights with Europe's first sovereign Legal AI Noxtua, which it developed with the internationally renowned law firm CMS to provide lawyers with a legally competent and compliant AI assistant. Hosted in the EU, Noxtua meets not only GDPR requirements but also high legal standards of professional confidentiality and privacy.
Embedding models are used to build retrieval-augmented generation (RAG) systems, which are needed to create efficient and reliable AI search solutions. They complement more traditional strategies such as keyword search by finding items based on their semantic meaning. Noxtua Voyage Embed can be used to build custom AI search solutions focused on legal topics, as it has been trained on high-quality legal documents and is therefore specialized in the legal domain.
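As a minimal sketch of how embedding-based search ranks items by meaning rather than by shared keywords, the example below scores documents against a query using cosine similarity. The three-dimensional vectors and the document names are hypothetical stand-ins; in a real system, the vectors would come from an embedding model such as Noxtua Voyage Embed and would have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vector, document_vectors):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vector, vector))
        for doc_id, vector in document_vectors.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings; a real model would produce these vectors.
document_vectors = {
    "tenancy_termination_ruling": [0.9, 0.1, 0.0],
    "data_protection_fine": [0.1, 0.9, 0.1],
    "employment_dismissal_ruling": [0.6, 0.2, 0.7],
}
# Stands in for the embedding of a query about ending a lease.
query_vector = [0.8, 0.2, 0.1]

ranking = rank_documents(query_vector, document_vectors)
top_id = ranking[0][0]
print(top_id)  # prints "tenancy_termination_ruling"
```

In a RAG setup, the top-ranked documents would then be passed to a language model as context for generating an answer.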
Enabling the creation of custom legal AI search solutions
“We are excited to launch Noxtua Voyage Embed together with Voyage AI and dejure.org, as this will drive the development of specialized and high-quality AI solutions. It will also allow us to further refine Noxtua, Europe's first sovereign legal AI, to support lawyers in their daily work. We see this international collaboration between established experts in their specific fields as another confirmation of our quest to promote independent AI with European values that meets the highest standards of competence and compliance. It’s the perfect ‘source global, host local’ approach,” highlights Dr. Leif-Nissen Lundbæk (CEO & Co-Founder Xayn).
“This collaboration between Voyage AI, Xayn, and dejure.org combines excellent AI and legal knowledge with two research-based AI startups from Oxford and Stanford, two of the most prestigious universities in the world, and one of the most important databases for legal documents in Germany. At Voyage, we focus on developing new techniques so that models can capture the nuances of specialized texts similarly to domain experts. The legal domain is very interesting and challenging for AI development because it is such a rule-based system that relies on accuracy and very precise use of language. Therefore, it is exciting to work on AI solutions that require deep technical and domain-specific knowledge, while also being practical,” states Tengyu Ma (CEO & Co-Founder Voyage AI, Assistant Professor at Stanford University).
“Searching for and in legal sources such as judgments and laws and the consolidation and evaluation of this information are essential components of legal work. To do this, lawyers need reliable and efficient solutions that they can trust. However, conventional AI offerings are usually trained on general data and therefore struggle immensely to achieve reliable results in such a highly specialized field as law. That's why we're excited to partner with Xayn and Voyage AI on this new and powerful legal search model. Noxtua Voyage Embed enables experts to develop customized AI applications on top of it that help legal professionals find relevant information faster and easier,” says Oliver García (CEO dejure.org).
Noxtua Voyage Embed as part of Europe’s first sovereign Legal AI Noxtua
The new Noxtua Voyage Embed search model is now available via API and can be used to build custom Legal AI search solutions.
Noxtua Voyage Embed is part of Xayn's larger sovereign Legal AI offering. The offering also includes Noxtua Legal Copilot, which lawyers can use to analyze, review, translate, and (re)write legal documents while protecting their clients' privacy. Europe's first sovereign legal AI is legally competent, having been trained on legal texts labeled by experts. In addition to being GDPR-compliant, Noxtua is hosted in the EU and meets the high requirements of professional secrecy (e.g. Section 203 German Criminal Code, Section 43e German Federal Code for Lawyers).