01: MidShip
MidShip is helping extract data from documents straight into your spreadsheets
What do you get when a mediocre tennis player*, a recent immigrant grappling with a sense of belonging, and a software engineer walk into a cafe? Apparently, a Y Combinator-backed startup. Their background in financial and data products and services at PayPal, Instacart, and Deloitte has fueled their mission to streamline capturing data from documents like PDFs right into your existing spreadsheets.
*Kieran’s words
Disclaimer: This is a company-focused analysis that includes my own assumptions, which are neither confirmed nor denied by the company itself, and may contain inaccuracies.
👤 Core Customer (ICP)
A product for everyone is a product for no one
Who: Analysts: MidShip is signalling a few use cases across different industries (Finance, Real Estate, Medical research).
Their examples seem to focus more on Real Estate and Finance, though, and I like the targeting even setting aside the team’s experience. Why? These tasks are often:
Frequent: there's lots of data being ingested all the time (text, tabular and financial images). These images need to be converted to numbers.
Urgent: brutal hours - Investment Bankers and Private Equity (P/E) analysts doing 80-100 hour weeks to get the work done. What kind of non-urgent work are you staying in the office past 7pm to do?
Expensive: these firms make a lot and the cost of getting it wrong is high (the wrong number in the model can make or break a decision)
Within Finance, let’s turn the Frequency dial up some more. Investment Banks will more likely deal with public companies (i.e. ones with a stock symbol) than Real Estate or Private Equity companies will. Public companies’ data is cleaner and more likely already in spreadsheets, thanks to compliance and the data infrastructure built out for the industry. Thus, I’d assume Private Equity data is messier and involves manual drudgery more often.
Also, I’ll assume that regardless of the area of Finance, the people doing this spreadsheet grunt work are typically junior analysts and summer interns, so I’d look for that person as the end user.
🎯 Junior Analysts at Private Equity (P/E) Companies who are analyzing non-public information
💊 Underserved Needs:
“Know your customers better than they know themselves” -Steve Jobs
In practice this will be filled in with real customer data through discovery and research but the below is just a thought exercise of some hypothetical needs:
P/E Analyst: Minimize the time it takes to update our company’s proprietary models so I can go home at midnight instead of 2am. 🥲
P/E Analyst: Maximize the percentage of time I’m spending doing enjoyable, skilled work so I can progress in my career. 📈
B2B Buyers: Help me attract more junior talent by promising better work-life balance without sacrificing our unit’s bottom line (and my bonus… 😉)
Other table stakes things:
B2B Buyers: Pass my organization's security requirements ☑️
B2B Buyers: Don’t screw up (i.e. have a transcription error that blows up in my face) ☑️
I’d argue the P/E Analyst doesn’t need too much motivation to use this tool. They are desperate for a solution! It’s the B2B Buyer(s) that are more important to convince.
📦 Product / Offering / Value Proposition:
The solution is as follows:
Upload your source documents (PDFs, reports, etc.)
Upload your target spreadsheet template
Midship extracts the required data and maps it to your template
Download your fully populated spreadsheet, ready for analysis in minutes
Death to tedium. As someone who has spent hundreds of hours of my life doing manual copy-paste and transcription, this feels like a magic API. Walk away, make a coffee, tada!
Value proposition:
For B2B Buyers, MidShip speaks briefly to Security on the website, but to what degree do they match the requirements of these organizations? Reliability isn’t mentioned on MidShip’s site; perhaps it’s not the concern for buyers that I’d imagine it would be.
For P/E Analyst Users, if the conversion takes 30 minutes as per this video, it might not fit every time-saving situation unless there are batch operations or improvements over time. I think this is more than fine for a start though.
📈 The Business
“A Product Manager’s job is to create value for the customer and the business” - Marty Cagan
Why Now? Gen AI, duh! But specifically, I think this is unlocked by LVLMs (Large Vision Language Models) such as GPT-4o. LVLMs are less mature than LLMs, which means more open space.
Economics: I believe the economics work for the ICP’s use case, but I wonder about AI inference costs on long documents, especially ones with lots of images. My research suggests roughly $0.02 of input and $0.04 of output cost per page of information with an LLM, and those costs creep up with images (LVLM). These costs are minor versus the cost of human production, but at about $0.06 a page, 1000 pages would cost $60, which would be meaningful.
Go To Market: As part of GTM, it appears they are approaching customers with partnerships and potential solutions. I like this approach, as the human touch will help with customer discovery and maybe Reliability as well!
Moat: I don’t see one today but providing the best solution to a small set of customers should make it hard for others to enter.
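For anyone who wants to play with the Economics numbers above, here is a rough back-of-the-envelope sketch. The per-page rates are just my estimates from the text, not MidShip’s actual costs or any provider’s published pricing, and `image_multiplier` is a hypothetical knob I made up to stand in for the extra cost of image-heavy pages.

```python
# Rough per-page rates in USD, taken from my estimate above (assumptions, not real pricing).
INPUT_COST_PER_PAGE = 0.02
OUTPUT_COST_PER_PAGE = 0.04


def inference_cost(pages: int, image_multiplier: float = 1.0) -> float:
    """Estimate the inference cost in USD for a document of `pages` pages.

    `image_multiplier` is a hypothetical factor (>= 1.0) for image-heavy
    documents; 1.0 means text-only pages at the base rates.
    """
    if pages < 0:
        raise ValueError("pages must be non-negative")
    if image_multiplier < 1.0:
        raise ValueError("image_multiplier must be at least 1.0")
    per_page = (INPUT_COST_PER_PAGE + OUTPUT_COST_PER_PAGE) * image_multiplier
    # Round to cents so floating-point noise does not leak into the estimate.
    return round(pages * per_page, 2)


if __name__ == "__main__":
    print(inference_cost(1000))  # text-only estimate for 1000 pages
    print(inference_cost(1000, image_multiplier=1.5))  # hypothetical image-heavy case
```

With the base rates, 1000 pages comes out to the $60 mentioned above. Any multiplier above 1.0 is a guess until it is checked against real usage.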
🦺 Risks
When you can remove risk, do it, when you can’t, reduce it
Aside from what I’ve mentioned that is top of mind for the value proposition:
Competition: this is a highly competitive space, from open source to big tech. Niching is typically important for a startup, and I think it’s particularly important here.
💬 Over To You
Let me know one thing you like about this piece and one thing you’d change.
Stay tuned for the next edition!








