Brown CS: Large Language Models

A server dedicated to running large language models (LLM) is available to computer science researchers, educators and students. Running OpenLLM, it can serve up to eight LLMs simultaneously, each with up to 48GB of VRAM. This is a new and experimental service.

Llama 3.1

The Llama 3.1 model (and here) with 8B parameters from Meta can be accessed at the following URL:

http://llmserver.cs.brown.edu:37090/chat

Deepseek Llama 3.1 8B

http://llmserver.cs.brown.edu:3090/chat

API doc for LLM models

http://llmserver.cs.brown.edu:37090

http://llmserver.cs.brown.edu:3090

Limitations

llmserver is only available from within the Computer Science network. You can access it from any wired machine (physical or virtual) within the department, or from the CS VPN.

The GPUs have 46GB of VRAM, which limits the size of LLM that can run on them.

This is a shared resource, so please behave yourself.

Hardware

The LLM server is located in the CIT data center and has the following specifications:

Dual AMD EPYC 9554 3.1GHz Processors with 64 cores each
8 NVIDIA L40S Ada 48GB GPUs
1.5 TB RAM

Additional LLMs

We will entertain requests to run other LLMs. Send your request to problem@cs.brown.edu.