

Does anyone know of an #OpenAccess full-text #PDF #search engine/tool that I can use to search for relevant PDFs in a self-hosted #database?

Context: we have a curated database of #research articles, but so far our search has been limited to tagged keywords and the title and abstract fields. We'd like to be able to search the full text of each PDF.

Side note: I know that PDFs are not a great way to store scientific information. I'd also prefer not to use a proprietary #LLM if possible.

#LexicalSearch #SemanticSearch #AskAcademia #academia #science #sciences #ScienceMastodon #AskFedi #OpenScience

in reply to manisha

Not sure if this really fits what you want, but #Zotero does full-text search of PDFs: zotero.org/support/searching
in reply to El Duvelle

thank you!! It looks like a good option. Someone else also suggested it. I've used it as a reference manager but never dived into its full-text indexing feature. Do you happen to know if Zotero libraries can be made public?
in reply to manisha

I haven't done it myself but it looks like it's just a matter of setting the library's visibility to "public": forums.zotero.org/discussion/8…
in reply to El Duvelle

@elduvelle

Ya ya Zotero would def be how I'd do it. Do you need it to be public so that it can be publicly full-text searchable from a website or smth? Slightly different problem than being able to do it on your own Zotero client, but I'd still use Zotero as the basis of the collection
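
For anyone curious what the lexical side of "search the entire PDF" boils down to, here is a rough sketch, assuming the text has already been extracted from each PDF (that step needs a separate extraction tool and is not shown). The file names and sample sentences are made up for illustration, and this is not how Zotero builds its index; it is just a tiny inverted index with AND matching.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids whose full text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents that contain every token in the query."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


# Hypothetical extracted text, keyed by a made-up PDF file name.
docs = {
    "paper_a.pdf": "Hippocampal place cells remap across environments.",
    "paper_b.pdf": "Grid cells in entorhinal cortex support path integration.",
}
index = build_index(docs)
print(sorted(search(index, "place cells")))
```

A real deployment would more likely lean on an existing engine with ranking, stemming and phrase queries, but the structure (extract text, index tokens, intersect matches) is the same idea.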