A server dedicated to running large language models (LLMs) is available to computer science researchers, educators, and students. Running OpenLLM, it can serve up to eight LLMs simultaneously, each with up to 48GB of VRAM. This is a new and experimental service.
Llama 3.1
The Llama 3.1 model with 8B parameters from Meta can be accessed at the following URL:
Deepseek Llama 3.1 8B
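OpenLLM serves models through an OpenAI-compatible HTTP API, so a model on this server can be queried with an ordinary chat-completions request. A minimal sketch in Python's standard library follows; the base URL, port, and model identifier are placeholders, not the real server address, so substitute the values for the deployment you are using.

```python
import json
import urllib.request


def build_chat_request(base_url, model, prompt):
    """Build a POST request for an OpenAI-compatible chat completions
    endpoint, which is the interface OpenLLM exposes."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical address and model name; replace with the actual
    # server URL and the model identifier it advertises.
    req = build_chat_request(
        "http://llmserver.example.edu:3000",
        "meta-llama/Llama-3.1-8B-Instruct",
        "Briefly explain what VRAM is.",
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.loads(resp.read())
        print(reply["choices"][0]["message"]["content"])
```

Because the API is OpenAI-compatible, the official OpenAI Python client can also be pointed at the server by setting its `base_url`, if you prefer that over raw HTTP.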
Limitations
llmserver is only available from within the Computer Science network. You can access it from any wired machine (physical or virtual) within the department, or from the CS VPN.
The GPUs have 46GB of VRAM, which limits the size of LLM that can run on them.
This is a shared resource, so please be considerate of other users.
Hardware
The LLM server is located in the CIT data center and has the following specifications:
8 NVIDIA L40S Ada 48GB GPUs
1.5 TB RAM
Additional LLMs
We will entertain requests to run other LLMs. Send your request to problem@cs.brown.edu.