Technology
Researchers Crack Open the ‘Black Box’ of Protein AI Models
The approach could accelerate drug target identification, vaccine research, and new biological discoveries.

For years, artificial intelligence models that predict protein structures and functions have been critical tools in drug discovery, vaccine development, and therapeutic antibody design. But while these protein language models (PLMs), built on the same architectures as large language models (LLMs), deliver impressively accurate predictions, researchers have been unable to see how the models arrive at those decisions — until now.
In a study published this week in the Proceedings of the National Academy of Sciences (PNAS), a team of MIT researchers unveiled a novel method to interpret the inner workings of these black-box models. By shedding light on the features that influence predictions, the approach could accelerate drug target identification, vaccine research, and new biological discoveries.
Cracking the protein ‘black box’
“Protein language models have been widely used for many biological applications, but there’s always been a missing piece: explainability,” said Bonnie Berger, Simons Professor of Mathematics and head of the Computation and Biology group in MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL). In a media statement, she explained, “Our work has broad implications for enhanced explainability in downstream tasks that rely on these representations. Additionally, identifying features that protein language models track has the potential to reveal novel biological insights.”
The study was led by MIT graduate student Onkar Gujral, with contributions from Mihir Bafna, also a graduate student, and Eric Alm, professor of biological engineering at MIT.
From AlphaFold to explainability
Protein modelling took off in 2018 when Berger and then-graduate student Tristan Bepler introduced the first protein language model. These models analyze amino acid sequences, much as ChatGPT processes words, to predict protein structure and function. Their innovation paved the way for powerful systems like AlphaFold, ESM2, and OmegaFold, transforming the fields of bioinformatics and molecular biology.
Yet, despite their predictive power, researchers remained in the dark about why a model reached certain conclusions. “We would get out some prediction at the end, but we had absolutely no idea what was happening in the individual components of this black box,” Berger noted.
The sparse autoencoder approach
To address this challenge, the MIT team employed a technique called a sparse autoencoder — an algorithm recently used to interpret LLMs. Sparse autoencoders expand the representation of a protein across thousands of neural nodes, making it easier to distinguish which specific features influence a prediction.
“In a sparse representation, the neurons lighting up are doing so in a more meaningful manner,” explained Gujral in a media statement. “Before the sparse representations are created, the networks pack information so tightly together that it’s hard to interpret the neurons.”
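To make the idea concrete, here is a minimal sparse autoencoder sketch in PyTorch. The embedding width, the expansion factor, and the L1 sparsity penalty are illustrative conventions from the LLM-interpretability literature, not details taken from the MIT study.

```python
# Minimal sparse autoencoder sketch (PyTorch). All dimensions and the
# penalty weight are illustrative, not taken from the PNAS study.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, embed_dim=1280, hidden_dim=16384):
        super().__init__()
        # Expand the dense protein embedding into a much wider latent space.
        self.encoder = nn.Linear(embed_dim, hidden_dim)
        self.decoder = nn.Linear(hidden_dim, embed_dim)

    def forward(self, x):
        z = torch.relu(self.encoder(x))   # sparse latent activations
        return self.decoder(z), z

model = SparseAutoencoder()
x = torch.randn(8, 1280)                  # stand-in for PLM embeddings
recon, z = model(x)
# Reconstruction loss plus an L1 term that pushes most latents to zero,
# so each remaining active node can be read as one candidate feature.
loss = nn.functional.mse_loss(recon, x) + 1e-3 * z.abs().mean()
loss.backward()
```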
By analyzing these expanded representations with assistance from the AI model Claude, the researchers could link specific nodes to biological features such as protein families, molecular functions, or even a protein's location in the cell. For instance, one node could be linked to signalling proteins involved in transmembrane ion transport.
Implications for drug discovery and biology
This new transparency could be transformational for drug design and vaccine development, allowing scientists to select the most reliable models for specific biomedical tasks. Moreover, the study suggests that as AI models become more powerful, they could reveal previously undiscovered biological patterns.
“Understanding what features protein models encode means researchers can fine-tune inputs, select optimal models, and potentially even uncover new biological insights from the models themselves,” Gujral said. “At some point, when these models get more powerful, you could learn more biology than you already know just from opening up the models.”
Society
Understanding AI: The Science, Systems, and Industries Powering a $3.6 Trillion Future
Explore how artificial intelligence is transforming finance, automation, and industry — and what the $3.6 trillion AI boom means for our future.

Artificial Intelligence (AI) has become a major point of discussion and a defining technological frontier of our time. Experiencing remarkable growth in recent years, AI refers to computer systems capable of mimicking human cognitive abilities such as learning, problem-solving, critical decision-making, and creativity. Its ability to identify objects, understand human language, and act autonomously makes it increasingly valuable in fields like automotive manufacturing, financial services, and fraud detection.
As of 2024, the global AI market is valued at approximately USD 638.23 billion, marking an 18.6% increase from 2023, and is projected to reach USD 3,680.47 billion by 2034 (Precedence Research). North America leads this global growth with a 36.9% market share, dominated by the United States and Canada. The U.S. AI market alone is valued at around USD 146.09 billion in 2024, representing nearly 22% of global AI investments.
The Early Evolution of AI: From Reactive Machines to Learning Systems
Our understanding of AI has evolved through different models, each representing a step closer to mimicking human intelligence.
Reactive Machines: The First Generation
One of the earliest and most famous AI systems was IBM’s Deep Blue, the chess-playing computer that defeated world champion Garry Kasparov in 1997. Deep Blue was a reactive machine model, relying on brute-force algorithms that evaluated millions of possible moves per second. It could process current data and generate responses, but lacked memory or the ability to learn from past experiences.
Reactive machines are task-specific and cannot adapt to new or unexpected conditions. Despite these limitations, they remain integral to automation, where precision and repeatability are more important than learning—such as in manufacturing or assembly-line robotics.
Limited Memory AI: Learning from Experience
To overcome the rigidity of reactive machines, researchers developed Limited Memory AI, a model that can store and recall past data to make more informed decisions. This model powers technologies such as self-driving cars, which constantly analyze road conditions, objects, and obstacles, using stored data to adjust their behaviour.
Limited Memory AI is also valuable in financial forecasting, where it uses historical market data to predict trends. However, its memory capacity is still finite, making it less suited for complex reasoning or tasks like Natural Language Processing (NLP) that require deeper contextual understanding.
Theoretical Models: Towards Human-Like Intelligence
Theory of Mind AI
The next conceptual step is Theory of Mind AI, a model designed to understand human emotions, beliefs, and intentions. This approach aims to enable AI systems to interact socially with humans, interpreting emotional cues and behavioral patterns.
Researchers like Neil Rabinowitz from Google DeepMind have developed early prototypes such as ToMnet, which attempts to simulate aspects of human reasoning. ToMnet uses artificial neural networks to observe other agents and predict their behavior. However, replicating the complexity of human mental states remains a distant goal, and these systems are still largely experimental.
Self-Aware AI: The Future Frontier
The ultimate ambition of AI research is self-aware AI — systems that possess conscious awareness and a sense of identity. While this remains speculative, the potential applications are vast. Self-aware AI could revolutionize fields like environmental management, creating bots capable of predicting ecosystem changes and implementing conservation strategies autonomously.
In education, self-aware systems could understand a student’s cognitive style and deliver personalized learning experiences, adapting dynamically to each learner.
However, replicating human self-awareness is extraordinarily complex. The human brain’s intricate memory, emotion, and decision-making systems remain only partially understood. Additionally, self-aware AI raises profound ethical and privacy concerns, as such systems would require massive amounts of sensitive data. Strict guidelines for data collection, storage, and usage would be essential before such systems could be deployed responsibly.
Artificial Intelligence in the Financial Services Industry
The financial sector has undergone a massive transformation powered by AI-driven analytics, automation, and predictive intelligence. AI enhances Corporate Performance Management (CPM) by improving speed and precision in financial planning, investment analysis, and risk management.
Natural Language Processing and Automation
Leading financial firms such as JPMorgan Chase and Goldman Sachs employ Natural Language Processing (NLP) — AI systems that understand human language — to streamline customer interaction and analyze market information. NLP tools like chatbots handle millions of customer queries efficiently, while advanced systems process unstructured text data from financial reports and news sources to inform investment decisions.
Paired with Optical Character Recognition (OCR) and document parsing, NLP systems can convert scanned or image-based documents into machine-readable text, accelerating compliance checks, fraud detection, and financial forecasting.
However, the accuracy of NLP models depends on the quality and diversity of training data. Biased or incomplete data can lead to errors in analysis, potentially influencing high-stakes financial decisions.
Generative AI in Finance
Another major shift in finance comes from Generative AI, a branch of AI that creates new content — including text, images, videos, and even financial models — based on learned patterns. Using Large Language Models (LLMs) and Generative Adversarial Networks (GANs), these systems simulate complex financial scenarios, improving fraud detection and stress testing.
For instance, PayPal and American Express use generative AI to simulate fraudulent transaction patterns, strengthening their security systems. Transformers — deep learning architectures behind tools like OpenAI’s GPT — enable these models to understand and generate human-like language, allowing them to summarize extensive reports, produce research briefs, and assist analysts in decision-making.
Yet, Generative AI also presents challenges. It can be manipulated through adversarial attacks, producing misleading or biased outputs if trained on flawed data. Ensuring transparency and fairness in training datasets remains critical to prevent discriminatory outcomes, especially in credit scoring and loan assessment.
AI and Automation: Revolutionizing Industry Operations
Artificial Intelligence has become a cornerstone of intelligent automation (IA), reshaping business process management (BPM) and robotic process automation (RPA). Traditional RPA handled repetitive, rule-based tasks, but with AI integration, these systems can now manage complex workflows that require contextual decision-making.
AI-driven automation enhances productivity, reduces operational costs, and increases accuracy. For example, in manufacturing, AI-enabled systems perform predictive maintenance by analyzing sensor data to detect machinery issues before failures occur, minimizing downtime and extending equipment lifespan.
In the automotive sector, AI-powered machine vision systems inspect car components with higher accuracy than human inspectors, ensuring consistent quality and safety. These innovations make automation not only efficient but also economically advantageous for large-scale industries.
Machine Learning: The Engine of Artificial Intelligence
At the heart of AI lies Machine Learning (ML) — algorithms that allow computers to learn from data and improve over time without explicit programming. Three fundamental ML models illustrate the core ideas behind many applications: Decision Trees, Linear Regression, and Logistic Regression.
Decision Trees
Decision trees simplify complex decision-making processes into intuitive, branching structures. Each branch represents a decision rule, and each leaf node gives an outcome or prediction. This makes them powerful tools for disease diagnosis in healthcare and credit risk assessment in finance. They handle both numerical and categorical data, offering transparency and interpretability.
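As a concrete (and entirely invented) illustration, the sketch below fits a shallow decision tree to a toy credit-risk table with scikit-learn and prints the learned rules:

```python
# Toy decision-tree example with scikit-learn; the data is invented.
from sklearn.tree import DecisionTreeClassifier, export_text

# Features: [income (k$), existing debt (k$)]; label: 1 = loan default.
X = [[30, 20], [80, 10], [45, 35], [90, 5], [25, 30], [60, 15]]
y = [1, 0, 1, 0, 1, 0]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["income", "debt"]))
print(tree.predict([[50, 25]]))  # classify a new applicant
```

The printed rules are exactly the branching structure described above, which is why decision trees are prized for interpretability.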
Linear Regression
Linear regression models the relationship between one dependent variable and one or more independent variables, making it useful for predictive analytics such as stock price forecasting. Its parameters are typically fitted using Ordinary Least Squares (OLS) or gradient descent. Its simplicity, efficiency, and scalability make it well suited to large datasets.
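A minimal OLS fit in NumPy shows the idea; the data points below are invented:

```python
# Ordinary Least Squares with NumPy; the observations are invented.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # e.g., time steps
y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])   # e.g., observed prices

# Add an intercept column and solve the least-squares problem.
A = np.column_stack([np.ones_like(x), x])
(intercept, slope), *_ = np.linalg.lstsq(A, y, rcond=None)
print(f"y ~ {intercept:.2f} + {slope:.2f} * x")
print("prediction at x = 6:", intercept + slope * 6)
```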
Logistic Regression
While linear regression predicts continuous outcomes, logistic regression is used for classification problems — determining whether an instance belongs to a particular category (e.g., yes/no, fraud/genuine). It calculates probabilities between 0 and 1 using a sigmoid function, providing fast and interpretable results. Logistic regression is widely used in healthcare (disease prediction) and finance (loan default assessment).
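The sketch below fits a toy fraud classifier with scikit-learn's LogisticRegression; the transaction data is invented:

```python
# Logistic-regression sketch with scikit-learn; the data is invented.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Feature: transaction amount (k$); label: 1 = fraudulent.
X = np.array([[1], [2], [3], [10], [12], [15]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = LogisticRegression().fit(X, y)
# predict_proba applies the sigmoid to the learned linear score,
# returning probabilities for [genuine, fraud].
print(clf.predict_proba([[8]]))
```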
Types of Machine Learning Algorithms
Machine Learning can be broadly classified into Supervised, Unsupervised, and Reinforcement Learning — each suited for different problem types.
Supervised Learning
In supervised learning, algorithms train on labelled datasets to identify patterns and make predictions. Once trained, they can generalize to new, unseen data. Applications include spam filtering, voice recognition, and image classification.
Supervised models handle both classification (categorical predictions) and regression (continuous predictions). Their strength lies in high accuracy and reliability when trained on quality data.
Unsupervised Learning
Unsupervised learning, in contrast, deals with unlabelled data. It identifies hidden patterns or groupings within datasets, commonly used in customer segmentation, market basket analysis, and anomaly detection.
By autonomously discovering relationships, unsupervised learning reduces human bias and is valuable in exploratory data analysis.
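A short k-means sketch makes customer segmentation concrete; the customer table is invented:

```python
# Unsupervised customer segmentation with k-means; data is invented.
import numpy as np
from sklearn.cluster import KMeans

# Each row: [annual spend (k$), store visits per month] for one customer.
customers = np.array([[5, 1], [6, 2], [50, 8], [55, 9], [20, 4], [22, 5]])

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(customers)
print(kmeans.labels_)           # cluster assignment per customer
print(kmeans.cluster_centers_)  # the discovered segment centroids
```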
Reinforcement Learning
While less mainstream than supervised or unsupervised learning, reinforcement learning trains algorithms through trial and error, rewarding desired outcomes. It is foundational in robotics, autonomous systems, and game AI — including the systems that now outperform humans in complex strategic games like Go and StarCraft.
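To make the trial-and-error loop concrete, here is tabular Q-learning on an invented five-state corridor; the reward scheme and hyperparameters are arbitrary:

```python
# Tabular Q-learning on a toy 5-state corridor; all parameters invented.
import numpy as np

n_states, n_actions = 5, 2          # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma, epsilon = 0.1, 0.9, 0.2
rng = np.random.default_rng(0)

for _ in range(2000):               # episodes
    s = 0
    while s != n_states - 1:        # reward 1 for reaching the last state
        # Epsilon-greedy: mostly exploit the best-known action, sometimes explore.
        a = rng.integers(n_actions) if rng.random() < epsilon else int(Q[s].argmax())
        s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
        r = 1.0 if s_next == n_states - 1 else 0.0
        # Update toward the reward plus the discounted best future value.
        Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
        s = s_next

print(Q.argmax(axis=1))  # learned policy: move right in every state
```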

Ethical and Societal Considerations of AI
Despite its transformative potential, AI raises significant ethical and privacy challenges. Issues such as algorithmic bias, data exploitation, and job displacement are increasingly at the forefront of public discourse.
Ethical AI demands transparent data practices, accountability in algorithm design, and equitable access to technology. Governments and academic institutions, including Capitol Technology University (captechu.edu), emphasize developing AI systems that align with social good, human rights, and sustainability.
Furthermore, the rise of generative AI has intensified debates about content authenticity, intellectual property, and deepfake misuse, underscoring the urgent need for comprehensive AI regulation.
A Technology Still in Transition
Artificial Intelligence stands at the intersection of opportunity and uncertainty. From Deep Blue’s deterministic algorithms to generative AI’s creative engines, the technology has redefined industries and continues to evolve at an unprecedented pace.
While self-aware AI and full cognitive autonomy remain theoretical, the rapid integration of AI across industries signals an irreversible shift toward machine-augmented intelligence. The challenge ahead is ensuring that this evolution remains ethical, inclusive, and sustainable — using AI to enhance human potential, not replace it.
References
- Artificial Intelligence (AI) Market Size to Reach USD 3,680.47 Bn by 2034 – Precedence Research
- What is AI, how does it work and what can it be used for? – BBC
- 10 Ways Companies Are Using AI in the Financial Services Industry – OneStream
- The Transformative Power of AI: Impact on 18 Vital Industries – LinkedIn
- What Is Artificial Intelligence (AI)? – IBM
- The Ethical Considerations of Artificial Intelligence – Capitol Technology University
- What is Generative AI? – Examples, Definition & Models – GeeksforGeeks
Math
Researchers Unveil Breakthrough in Efficient Machine Learning with Symmetric Data

MIT researchers have developed the first mathematically proven method for training machine learning models that can efficiently interpret symmetric data—an advance that could significantly enhance the accuracy and speed of AI systems in fields ranging from drug discovery to climate analysis.
In traditional drug discovery, for example, a human looking at a rotated image of a molecule can easily recognize it as the same compound. However, standard machine learning models may misclassify the rotated image as a completely new molecule, highlighting a blind spot in current AI approaches. This shortcoming stems from a failure to account for symmetry — the property that an object’s fundamental characteristics remain unchanged under transformations such as rotation.
“If a drug discovery model doesn’t understand symmetry, it could make inaccurate predictions about molecular properties,” the researchers explained. While some empirical techniques have shown promise, there was previously no provably efficient way to train models that rigorously account for symmetry—until now.
“These symmetries are important because they are some sort of information that nature is telling us about the data, and we should take it into account in our machine-learning models. We’ve now shown that it is possible to do machine-learning with symmetric data in an efficient way,” said Behrooz Tahmasebi, MIT graduate student and co-lead author of the new study, in a media statement.
The research, recently presented at the International Conference on Machine Learning, is co-authored by fellow MIT graduate student Ashkan Soleymani (co-lead author), Stefanie Jegelka (associate professor of EECS, IDSS member, and CSAIL member), and Patrick Jaillet (Dugald C. Jackson Professor of Electrical Engineering and Computer Science and principal investigator at LIDS).
Rethinking how AI sees the world
Symmetric data appears across numerous scientific disciplines. For instance, a model capable of recognizing an object irrespective of its position in an image demonstrates such symmetry. Without built-in mechanisms to process these patterns, machine learning models can make more mistakes and require massive datasets for training. Conversely, models that leverage symmetry can work faster and with fewer data points.
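One standard way to build symmetry in, shown as a sketch below, is group averaging: run a base model on every transformed copy of the input and average the outputs, which makes the prediction exactly invariant to those transformations. This is a common baseline used here purely for illustration, not the algorithm from the MIT paper.

```python
# Group-averaging ("symmetrization") sketch in PyTorch: wrap any image
# model so its output is invariant to the four 90-degree rotations.
import torch
import torch.nn as nn

class RotationInvariant(nn.Module):
    def __init__(self, base: nn.Module):
        super().__init__()
        self.base = base

    def forward(self, x):  # x: (batch, channels, H, W)
        # Average predictions over the whole rotation group.
        outs = [self.base(torch.rot90(x, k, dims=(-2, -1))) for k in range(4)]
        return torch.stack(outs).mean(dim=0)

base = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
model = RotationInvariant(base)
x = torch.randn(1, 1, 28, 28)
# The wrapped model gives (numerically) identical outputs for rotated inputs.
print(torch.allclose(model(x), model(torch.rot90(x, 1, dims=(-2, -1))), atol=1e-5))
```

Even this simple baseline exposes the trade-off the researchers study: averaging over the group multiplies compute by the group size, the kind of cost their algebraic-geometric reformulation is designed to avoid.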
“Graph neural networks are fast and efficient, and they take care of symmetry quite well, but nobody really knows what these models are learning or why they work. Understanding GNNs is a main motivation of our work, so we started with a theoretical evaluation of what happens when data are symmetric,” Tahmasebi noted.
The MIT researchers explored the trade-off between how much data a model needs and the computational effort required. Their resulting algorithm brings symmetry to the fore, allowing models to learn from fewer examples without spending excessive computing resources.
Blending algebra and geometry
The team combined strategies from both algebra and geometry, reformulating the problem so the machine learning model could efficiently process the inherent symmetries in the data. This innovative blend results in an optimization problem that is computationally tractable and requires fewer training samples.
“Most of the theory and applications were focusing on either algebra or geometry. Here we just combined them,” explained Tahmasebi.
By demonstrating that symmetry-aware training can be both accurate and efficient, the breakthrough paves the way for the next generation of neural network architectures, which promise to be more precise and less resource-intensive than conventional models.
“Once we know that better, we can design more interpretable, more robust, and more efficient neural network architectures,” added Soleymani.
This foundational advance is expected to influence future research in diverse applications, including materials science, astronomy, and climate modeling, wherever symmetry in data is a key feature.
Books
Humour, Humanity, and the Machine: A New Book Explores Our Comic Relationship with Technology
MIT scholar Benjamin Mangrum examines how comedy helps us cope with, critique, and embrace computing.

In a world increasingly shaped by algorithms, automation, and artificial intelligence, one unexpected tool continues to shape how we process technological change: comedy.
That’s the central argument of a thought-provoking new book by MIT literature professor Benjamin Mangrum, titled The Comedy of Computation: Or, How I Learned to Stop Worrying and Love Obsolescence, published this month by Stanford University Press. Drawing on literature, film, television, and theater, Mangrum explores how humor has helped society make sense of machines, and of the humans who build and depend on them.
“Comedy makes computing feel less impersonal, less threatening,” Mangrum writes. “It allows us to bring something strange into our lives in a way that’s familiar, even pleasurable.”
From romantic plots to digital tensions
One of the book’s core insights is that romantic comedies, perhaps surprisingly, have been among the richest cultural spaces for grappling with our collective unease about technology. Mangrum traces this back to classic narrative structures, where characters who begin as obstacles eventually become partners in resolution. He suggests that computing often follows a similar arc in cultural storytelling.
“In many romantic comedies,” Mangrum explains, “there’s a figure or force that seems to stand in the way of connection. Over time, that figure is transformed and folded into the couple’s union. In tech narratives, computing sometimes plays this same role, beginning as a disruption, then becoming an ally.”
This structure, he notes, is centuries old, prevalent in Shakespearean comedies and classical drama, but it has found renewed relevance in the digital age.
Satirizing silicon dreams
In the book, Mangrum also explores what he calls the “Great Tech-Industrial Joke”: a mode of cultural humor aimed squarely at the inflated promises of the technology industry. Many of today’s comedies, from satirical shows like Silicon Valley to viral social media content, lampoon the gap between utopian tech rhetoric and underwhelming or problematic outcomes.
“Tech companies often announce revolutionary goals,” Mangrum observes, “but what we get is just slightly faster email. It’s a funny setup, but also a sharp critique.”
This dissonance, he argues, is precisely what makes tech such fertile ground for comedy. We live with machines that are both indispensable and, at times, disappointing. Humor helps bridge that contradiction.
The ethics of authenticity
Another recurring theme in The Comedy of Computation is the modern ideal of authenticity, and how computing complicates it. From social media filters to AI-generated content, questions about what’s “real” are everywhere, and comedy frequently calls out the performance.
“Comedy has always mocked pretension,” Mangrum says. “In today’s context, that often means jokes about curated digital lives or artificial intelligence mimicking human quirks.”
Messy futures, meaningful laughter
Ultimately, Mangrum doesn’t claim that comedy solves the challenges of computing; rather, he argues that it gives us a way to live with them.
“There’s this really complicated, messy picture,” he notes. “Comedy doesn’t always resolve it, but it helps us experience it, and sometimes, laugh through it.”
As we move deeper into an era of smart machines, digital identities, and algorithmic decision-making, Mangrum’s book reminds us that a well-placed joke might still be one of our most human responses.
(With inputs from MIT News)