OpenMed: Free On-Device Clinical AI, AWS Alternative

OpenMed is a free, Apache-2.0 clinical AI toolkit with 2,400 GitHub stars that runs 1,000-plus specialized medical models entirely on your own hardware, extracting entities from clinical notes, detecting PHI, and de-identifying patient records without ever sending data to a vendor. AWS Comprehend Medical charges $0.01 per unit for entity detection, where one unit is 100 characters of text. OpenMed replaces that bill with electricity and a download.

Beyond entity extraction, the toolkit detects and strips all 18 HIPAA Safe Harbor identifiers and classifies diagnoses against ICD-10 and SNOMED-CT. To put the AWS pricing in concrete terms: a healthcare organization processing 10,000 clinical notes per month at an average of 3,000 characters per note consumes 300,000 units, which is roughly $3,000 per month in AWS bills for entity detection alone, before any PHI or SNOMED-CT calls. With OpenMed, it is a server and a download.

What you are currently paying for

AWS Comprehend Medical is the dominant cloud service for clinical NLP. Its entity detection API charges $0.01 per unit for the first million units, where a unit is 100 characters. PHI detection runs $0.0014 per unit. SNOMED-CT inference is $0.0075 per unit. These fees stack when you run multiple operations on the same note, and they add up at anything approaching real healthcare volume.
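To see how those fees stack, here is a minimal sketch of the arithmetic in Python. It uses only the per-unit prices quoted above (first pricing tier), and it assumes each note is billed on its own with partial units rounded up; the function name and the operation keys are made up for illustration, so check current AWS pricing before relying on the numbers.

```python
import math
from decimal import Decimal

# Per-unit prices as quoted in this article (first tier); not fetched from AWS.
PRICE_PER_UNIT = {
    "entities": Decimal("0.01"),
    "phi": Decimal("0.0014"),
    "snomed_ct": Decimal("0.0075"),
}
CHARS_PER_UNIT = 100


def monthly_cost(notes_per_month, avg_chars_per_note, operations):
    """Estimate the monthly bill in dollars for running `operations` on every note."""
    # Assumption: each note is billed separately and a partial unit counts as a whole one.
    units_per_note = math.ceil(avg_chars_per_note / CHARS_PER_UNIT)
    total_units = notes_per_month * units_per_note
    return sum(PRICE_PER_UNIT[op] * total_units for op in operations)


entity_only = monthly_cost(10_000, 3_000, ["entities"])
all_three = monthly_cost(10_000, 3_000, ["entities", "phi", "snomed_ct"])
print(f"Entity detection only: ${entity_only:,.2f}")  # $3,000.00
print(f"All three operations:  ${all_three:,.2f}")    # $5,670.00
```

At 10,000 notes of 3,000 characters each, entity detection alone comes to about $3,000 per month, and running all three operations on every note nearly doubles that.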

The business case for cloud medical NLP is convenience: you send text to an API endpoint and get back structured data. But there is a second cost that does not appear on the bill. Every piece of clinical text sent to that endpoint leaves your network. For organizations under HIPAA, that means a Business Associate Agreement, compliance review, and ongoing audit exposure. For international organizations under GDPR, it often means the data simply cannot leave the country at all.

Most organizations using AWS Comprehend Medical are not doing it because they prefer the cloud. They are doing it because building local clinical NLP has historically required ML engineers and months of work. OpenMed removes that constraint.

What OpenMed actually does

OpenMed is a Python library and optional REST service that gives you access to 1,000-plus curated biomedical and clinical models through a single API. You call one function with clinical text and a model name, and you get back structured entities with confidence scores, all running locally on your hardware.

The core capabilities map almost directly to what AWS Comprehend Medical sells:

Clinical named entity recognition pulls out diseases, drugs, dosages, procedures, and lab values from unstructured notes. The library ships a model called disease_detection_superclinical that identifies conditions and their context in one call. PHI detection covers names, dates, phone numbers, addresses, account numbers, and the rest of the 18 Safe Harbor identifiers. The de-identification pipeline does more than detect: it replaces real values with format-preserving fakes, so a note that says "John Smith, DOB 04/12/1962" comes out as "Michael Torres, DOB 07/23/1981," which is structurally intact for downstream processing but contains no real patient data.
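The idea behind format-preserving replacement can be illustrated with a toy sketch. This is not OpenMed's implementation or API. It assumes the PHI spans have already been detected (the hard part, which the models handle) and shows only the substitution step. The surrogate name pool, the labels, and the function names are all hypothetical.

```python
import random
from datetime import date, timedelta

# Hypothetical pool of surrogate names; a real system would use a far larger one.
FAKE_NAMES = ["Michael Torres", "Dana Whitfield", "Priya Raman"]


def fake_date_like(original, rng):
    """Return a random MM/DD/YYYY date that differs from the original string."""
    while True:
        fake = date(1940, 1, 1) + timedelta(days=rng.randrange(365 * 60))
        text = fake.strftime("%m/%d/%Y")
        if text != original:
            return text


def surrogate(text, spans, seed=None):
    """Replace each (start, end, label) span with a fake value of the same shape."""
    rng = random.Random(seed)
    out = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for start, end, label in sorted(spans, reverse=True):
        original = text[start:end]
        if label == "NAME":
            replacement = rng.choice(FAKE_NAMES)
        elif label == "DATE":
            replacement = fake_date_like(original, rng)
        else:
            replacement = f"[{label}]"  # fallback: plain redaction tag
        out = out[:start] + replacement + out[end:]
    return out


note = "John Smith, DOB 04/12/1962"
spans = [(0, 10, "NAME"), (16, 26, "DATE")]  # offsets assumed to come from a PHI detector
print(surrogate(note, spans, seed=7))
```

The output keeps the "Name, DOB MM/DD/YYYY" layout, so downstream parsers that expect that shape keep working, while the real name and birth date are gone.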

Beyond the core NLP functions, OpenMed adds things AWS Comprehend Medical does not have: a batch processor for parallel document-level operations, a Docker REST service for team deployments, and a native iOS and macOS SDK through OpenMedKit for Swift applications. It runs on CPU, CUDA, and Apple Silicon, with MLX acceleration on M-series hardware.

The self-hosting economics

The Apache-2.0 license covers everything, including commercial use. Installation is a single pip command. The REST service ships as a Docker container. There are no per-call fees, no character caps, and no vendor relationship required.

The realistic costs are hardware and engineering time. Clinical NLP models are not trivial in size. The smaller models in the library run adequately on CPU-only machines, but the higher-quality entity extraction models run meaningfully faster on a GPU or Apple Silicon. For a clinic processing notes in real time, you probably want a dedicated server. A GPU instance on AWS runs $0.30 to $1.50 per hour, depending on size and region. Dedicated on-premises hardware runs anywhere from a few thousand dollars for a refurbished GPU workstation to tens of thousands for a production server.

The break-even calculation depends entirely on volume. At low volume (under a few hundred notes per month), the hosted AWS service is actually cheaper when you factor in the engineering time to stand up and maintain a self-hosted deployment. At medium volume, say 1,000 to 5,000 notes per month, it becomes roughly equivalent. Above that, OpenMed is substantially cheaper, and the compliance and privacy advantages are independent of volume.
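A rough way to locate that crossover is to divide a fixed monthly hosting cost by the per-note API fee. The sketch below assumes entity detection only at the $0.01-per-unit price, 3,000-character notes, and an always-on GPU instance billed for about 730 hours a month. It deliberately ignores engineering time, which in practice pushes the break-even point higher. The function name is illustrative.

```python
import math
from decimal import Decimal, ROUND_CEILING


def break_even_notes(fixed_monthly_cost, avg_chars_per_note,
                     price_per_unit=Decimal("0.01"), chars_per_unit=100):
    """Notes per month at which per-unit API fees match a fixed hosting cost."""
    units_per_note = math.ceil(avg_chars_per_note / chars_per_unit)
    api_cost_per_note = price_per_unit * units_per_note
    ratio = Decimal(fixed_monthly_cost) / api_cost_per_note
    # Round up: a fractional note still has to be paid for.
    return int(ratio.to_integral_value(rounding=ROUND_CEILING))


HOURS_PER_MONTH = 730  # assumption: instance runs around the clock

for hourly_rate in (Decimal("0.30"), Decimal("1.50")):
    monthly = hourly_rate * HOURS_PER_MONTH
    notes = break_even_notes(monthly, 3_000)
    print(f"${hourly_rate}/hour -> ${monthly}/month -> break-even at {notes} notes")
```

Under these assumptions the crossover falls between roughly 730 and 3,650 notes per month, in the same neighborhood as the medium-volume band described above.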

What to know before switching

Setup is more involved than pointing at an API endpoint. You need Python 3.10 or above, and model files download on first use, which takes time depending on which models you select. The catalog spans 1,000-plus models across specialties and languages, but you load them by name on demand rather than downloading the whole library.

The quality story is competitive. The repository includes benchmark comparisons showing several OpenMed models outperforming proprietary cloud APIs on standard clinical NLP benchmarks. That said, quality varies by model and use case. Any organization with high-stakes clinical workflows should run its own evaluation on representative data before replacing a production system.

One real gap: OpenMed does not provide downstream workflow integration out of the box. AWS Comprehend Medical plugs into the broader AWS ecosystem. OpenMed gives you the NLP primitives and leaves integration to you. That is fine if your team has engineering capacity, but not if you need a turnkey system.

The project is actively maintained. v1.5.5 shipped June 8, 2026, with batch PII support and a hardened security model that closed a remote code execution path from an earlier release. When the payload is patient data, that kind of transparency about security fixes matters.

Where this fits

The most direct use case is any organization that processes clinical text at meaningful volume, has engineering resources to stand up a Python service, and needs patient data to stay on-premises. That covers hospital systems, telehealth companies, clinical research organizations, and healthcare software vendors building NLP into their own products.

For software vendors, the math is compelling. AWS Comprehend Medical creates a variable per-unit cost that scales with customer usage. OpenMed creates a fixed infrastructure cost. At product scale, those are very different businesses.

The HIPAA angle is not secondary. An on-device system that processes clinical notes without any external call removes an entire category of compliance risk that a cloud API, however well-documented its BAA, introduces by definition.

Self-hosted clinical AI has been theoretically possible for years. What changed is that it became practical. A library that installs with pip, ships a thousand curated models, and benchmarks competitively against cloud APIs is a different kind of option than what existed in 2021. The question most organizations face is not whether this can work. It is whether to make the investment before the next AWS invoice or after.
