Sunami's Legal AI Benchmark Challenge

2 points by sunami-ai 11 hours ago

We're inviting anyone interested in Legal AI to compare the output from O1 Pro or O3 with the output from our Legal AI and decide which produces the better analysis for any of the contracts in our public feed (we just started and will add more as we go). For example, you can load up the YC SAFE (Cap, No Discount) from our feed and see how the analysis differs across O1 Pro, O3-mini-high, and Sunami.

If O1 Pro and O3, AIs that cost BILLIONS, do not match the accuracy of our analysis, negotiation advice, and risk scoring (on the spiral slide, click on any circle to see it), then we win a trophy, alright?

We tried contacting CodeX, The Stanford Center for Legal Informatics, on their LinkedIn page. No response as of yet.

The legal infographics for each of the agreements were generated by our Legal AI, which creates the analysis and outputs the infographics.

We attempted to apply to YC, but our application was 5 minutes late. Plus, we are older than the typical YC founder, so we are not holding out hope.

Having said that, we have begun courting a close friend of one of our founders who happens to be a former deputy AG of California and former AG of another major US state, specializing in fighting corporate fraud and corruption. Funnily enough, he has advised us to stay away from the fight for justice and against corruption and to focus on being a useful tool for startups to reduce their contract review costs.

For now, what we tell potential investors is that "we make art, not money", but we do have some plans in the real estate market (not very exciting compared to exposing corporate corruption and greed).

Hope you like it, and that you can tell us whether O3-mini-high or O1 Pro do any better. In our casual testing they did not cover as much ground and could not produce reliable, sensible risk scores. Their scores were all over the place.

We have a few lawyers as advisors, but we are always looking for more eyeballs and friendly curiosity.