Requirement Analysis

Overview

Initially, we evaluated the hardware requirements for running LLaMA:3.1-405B-fp16 and Deepseek-V3-671B-fp16 models. The hardware specifications were extremely high and unfeasible for our setup:

Given these impractical requirements, we opted for a more manageable model—LLaMA:3-8B-fp16.

New Model Decision: LLaMA:3-8B-fp16

After re-evaluating the project’s needs and considering available resources, we decided to use LLaMA:3-8B-fp16, which requires 16 GB of graphics memory. Additionally, the Facebook Zero Shot model requires 2 GB of VRAM, bringing the total requirement to 18 GB of VRAM.

GPU Decision: NVIDIA RTX 4090

To meet these new requirements, we chose the NVIDIA RTX 4090, which provides 24 GB of VRAM. This model not only covers our current needs but also leaves additional memory capacity for future scalability.

Why RTX 4090?

Next Steps