Fall 2023 CS292F (Machine Learning on Graphs) course project
In this project, we introduced a deep-learning framework that reconstructs images from human brain fMRI data using Latent Diffusion Models (LDMs).
Our contributions are:
- We proposed four brain-to-image decoding neural network modules;
- We implemented a novel GCN-based module for brain decoding tasks;
- We adapted our architecture to two distinct datasets (NSD and THINGS-fMRI) and established new benchmarks for future studies.
Figure 1. Proposed framework overview, image adapted from Takagi & Nishimoto, 2023 and Lu et al., 2023.
🔗 For environment setup, please see: Implementation Details Section
📑 For further reading, please see: final_report.pdf
Reconstructing images from brain activity can provide valuable insights into neural coding mechanisms.
Recent work on brain-to-image tasks has often relied on linear projections from fMRI features to pre-trained latent spaces, which may not fully capture the brain's nonlinear neural coding.
To address these gaps:
- We explored nonlinear architectures (CNN, VAE, GCN) for brain-to-image decoding.
- We incorporated LDM (Stable Diffusion) to reconstruct high-fidelity images from neural activity.
Inspired by Takagi & Nishimoto, 2023, and by the understanding that the lower visual cortex is more closely tied to low-level image features (e.g., edges and colors) while the higher visual cortex is associated with high-level semantic information, our reconstruction pipeline consists of two stages:
Stage 1:
- Map higher visual cortex fMRI activity to CLIP latent text embeddings.
- Map lower visual cortex fMRI activity to VQ-VAE latent image embeddings.
Stage 2:
- Generate images using the LDM (Stable Diffusion) conditioned on the mapped latent text and image features (see the schematic sketch below).
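To make the data flow concrete, here is a schematic sketch of the two-stage pipeline. All function names are hypothetical placeholders, not APIs from this repository or from Stable Diffusion; the real entry points are the notebooks listed under Implementation Details.

```python
# Schematic sketch only: every function body is a placeholder, and none of
# these names are real APIs from this repository or from Stable Diffusion.
import torch

def decode_text(fmri_high: torch.Tensor) -> torch.Tensor:
    """Stage 1 placeholder: higher visual cortex activity -> CLIP text embedding."""
    raise NotImplementedError  # trained fMRI-to-text module goes here

def decode_image(fmri_low: torch.Tensor) -> torch.Tensor:
    """Stage 1 placeholder: lower visual cortex activity -> latent image embedding."""
    raise NotImplementedError  # trained fMRI-to-image module goes here

def ldm_sample(init_latent: torch.Tensor, conditioning: torch.Tensor) -> torch.Tensor:
    """Stage 2 placeholder: LDM sampling from init_latent, conditioned on the
    decoded text embedding."""
    raise NotImplementedError

def reconstruct(fmri_high: torch.Tensor, fmri_low: torch.Tensor) -> torch.Tensor:
    c = decode_text(fmri_high)   # semantic conditioning
    z = decode_image(fmri_low)   # low-level initialization
    return ldm_sample(init_latent=z, conditioning=c)
```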
- fMRI-to-text module
- CNN-based: residual Conv1D layers followed by 3 fully connected layers (inspired by Lin et al., 2022); a minimal sketch follows below
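A minimal PyTorch sketch of such a module, assuming a flat voxel vector as input; the channel counts, hidden widths, and CLIP embedding dimension are illustrative assumptions, not the report's exact hyperparameters.

```python
import torch
import torch.nn as nn

class ResidualConv1dBlock(nn.Module):
    """Conv1d block with a skip connection (layer sizes are illustrative)."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.conv(x))

class FMRIToTextCNN(nn.Module):
    """Residual Conv1d stack followed by 3 fully connected layers.
    n_voxels and clip_dim are placeholders; actual sizes depend on the ROI
    and the CLIP variant."""
    def __init__(self, n_voxels: int, clip_dim: int = 768,
                 channels: int = 16, n_blocks: int = 2):
        super().__init__()
        self.stem = nn.Conv1d(1, channels, kernel_size=3, padding=1)
        self.blocks = nn.Sequential(
            *[ResidualConv1dBlock(channels) for _ in range(n_blocks)])
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(channels * n_voxels, 2048), nn.ReLU(),
            nn.Linear(2048, 1024), nn.ReLU(),
            nn.Linear(1024, clip_dim),
        )

    def forward(self, x):           # x: (batch, n_voxels)
        x = x.unsqueeze(1)          # -> (batch, 1, n_voxels)
        x = self.blocks(self.stem(x))
        return self.head(x)         # -> (batch, clip_dim)
```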
- fMRI-to-image modules
- CNN-based: residual Conv1D layers followed by 3 fully connected layers
- VAE-based: Variational Autoencoder with two fully connected layers as the encoder and three FC-BN-LeakyReLU blocks as the decoder
- GCN-based:
- Two ChebConv layers with BatchNorm and ReLU
- To construct a graph representation from the raw fMRI signals (see the sketch after this list):
- Voxels from visual areas V1 and V2 were treated as two separate nodes, and V3 and V4 were combined into a third node.
- Each graph node's features were the corresponding ROI's normalized voxel activity.
- Graph edges were weighted by the functional connectivity across nodes, computed with Pearson's correlation coefficient.
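A minimal PyTorch Geometric sketch of the graph construction and the GCN module, assuming each ROI's activity has been resampled to a common feature length; the hidden size, Chebyshev order `K`, and mean-pooling readout are illustrative assumptions rather than the report's exact design.

```python
import numpy as np
import torch
import torch.nn as nn
from torch_geometric.nn import ChebConv

def build_roi_graph(roi_feats):
    """roi_feats: (num_nodes, d) array; row i holds one ROI node's normalized
    voxel activity, resampled to a common length d (an assumption of this
    sketch). Edges: fully connected, weighted by Pearson's r between nodes."""
    x = torch.as_tensor(roi_feats, dtype=torch.float)
    corr = np.corrcoef(roi_feats)                     # Pearson correlation matrix
    src, dst = np.nonzero(~np.eye(len(roi_feats), dtype=bool))
    edge_index = torch.tensor(np.stack([src, dst]), dtype=torch.long)
    edge_weight = torch.tensor(corr[src, dst], dtype=torch.float)
    return x, edge_index, edge_weight

class FMRIToImageGCN(nn.Module):
    """Two ChebConv layers, each followed by BatchNorm and ReLU, then a linear
    readout over mean-pooled node embeddings (pooling choice is an assumption)."""
    def __init__(self, in_dim: int, latent_dim: int, hidden: int = 256, K: int = 3):
        super().__init__()
        self.conv1 = ChebConv(in_dim, hidden, K=K)
        self.bn1 = nn.BatchNorm1d(hidden)
        self.conv2 = ChebConv(hidden, hidden, K=K)
        self.bn2 = nn.BatchNorm1d(hidden)
        self.out = nn.Linear(hidden, latent_dim)

    def forward(self, x, edge_index, edge_weight):
        h = torch.relu(self.bn1(self.conv1(x, edge_index, edge_weight)))
        h = torch.relu(self.bn2(self.conv2(h, edge_index, edge_weight)))
        return self.out(h.mean(dim=0))                # pool the ROI nodes
```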
Baseline: Takagi & Nishimoto, 2023
Nonlinear models significantly outperformed linear baselines in decoding fMRI to image and text latent spaces.
Our proposed CNN-based fMRI-to-text module and GCN-based fMRI-to-image module yielded the best reconstruction results, both qualitatively and quantitatively, on both the NSD and THINGS-fMRI datasets.
Sample reconstructed images from NSD dataset:
Sample reconstructed images from THINGS-fMRI dataset:
All reconstructed images are available: Google Drive
Our work introduced four brain-to-stimuli decoding methods and showed the capability of nonlinear brain-inspired architectures in reconstructing images from fMRI data, providing potential insights into visual reconstructions for Brain-Computer Interface applications.
Create and activate the conda environment named `ldm` from `environment_cs292.yml`:

```bash
cd cs292f
conda env create -f environment_cs292.yml
conda activate ldm
```
Install Stable Diffusion v1.4 (under the `diffusion_sd1/` directory), download the checkpoint (`sd-v1-4.ckpt`), and place it under the `codes/diffusion_sd1/stable-diffusion/models/ldm/stable-diffusion-v1/` directory.
Note: I hard-coded some file paths. Please run

```bash
grep -r '/hdd/yuchen'
```

and change the file paths accordingly to make sure everything is stored in the intended location.
- `generate_files.ipynb`: generating fMRI and image data files
- `roi_image_encoder.ipynb`: mapping low-level fMRI to image CLIP space using GCN
- `roi_text_encoder.ipynb`: mapping high-level fMRI to text CLIP space using GCN
- `evaluation.ipynb`: evaluation
📧 Yuchen Hou | GitHub | LinkedIn | Webpage
🚀 I'm always happy to chat about research ideas, potential collaborations, or anything you're passionate about!