
OUR MISSION
ORGANIZE SCIENCE

The original promise of computing was to solve information overload in science.
But classical computers were specialized for retrieval and storage, not pattern recognition.
As a result, we've had an explosion of information but not of intelligence: the means to process it.
Researchers are buried under a mass of papers, increasingly unable to distinguish between the meaningful and the inconsequential.
Galactica aims to solve this problem.
Our first release is a powerful large language model (LLM) trained on over 48 million papers, textbooks, reference material, compounds, proteins and other sources of scientific knowledge.
You can use it to explore the literature, ask scientific questions, write scientific code, and much more.
We believe models should be open source, so we are open-sourcing the model for those who want to extend it.
pip install galai

import galai as gal

model = gal.load_model("huge")
model.generate("The Transformer architecture [START_REF]")
# The Transformer architecture [START_REF] Attention is All you Need, Vaswani[END_REF] has been widely used in natural language processing.
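As the output above shows, the model wraps predicted citations in [START_REF]…[END_REF] markers. A minimal sketch of pulling those citations out of a generation, assuming only the marker convention shown above (the extract_references helper is hypothetical, not part of galai):

```python
import re

def extract_references(text):
    # Hypothetical helper: return the contents of every
    # [START_REF] ... [END_REF] span in the generated text.
    return re.findall(r"\[START_REF\](.*?)\[END_REF\]", text)

output = ("The Transformer architecture [START_REF] Attention is All you Need, "
          "Vaswani[END_REF] has been widely used in natural language processing.")
print(extract_references(output))  # [' Attention is All you Need, Vaswani']
```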

Limitations
You should be aware of the following limitations when using the model (including the demo on this website):

- Language Models can Hallucinate. There are no guarantees for truthful or reliable output from language models, even large ones trained on high-quality data like Galactica. NEVER FOLLOW ADVICE FROM A LANGUAGE MODEL WITHOUT VERIFICATION.
- Language Models are Frequency-Biased. Galactica is good at generating content about well-cited concepts, but does less well for less-cited concepts and ideas, where hallucination is more likely.
- Language Models are often Confident But Wrong. Some of Galactica's generated text may appear very authentic and highly confident, but may be subtly wrong in important ways. This is particularly the case for highly technical content.