For a variety of reasons, some businesses are inclined to use less powerful but open-source LLMs in their AI (artificial intelligence) applications.
People who make this decision usually want to save on API call costs, or do not trust the data protection measures of AI service providers.
Multiable takes no stance on this approach. Still, let's take a look at the running cost of a self-owned LLAMA, one of the most popular open-source LLMs at the moment.
Many people make a mistake when they plan to set up a 'usable' LLAMA in the cloud, notably by ignoring the range of cloud services needed for a production run.
Yup, maybe 1 out of 5 IT guys is inclined to use the cost of a UAT environment when applying for budget from management, and then things turn ugly when the system goes live!!!
In fact, hosting LLAMA (Large Language Model Meta AI) on Amazon Web Services (AWS) involves several cost components associated with different AWS services.
- Amazon EC2 (Elastic Compute Cloud):
  - Pricing depends on the instance type and configuration chosen. For hosting LLAMA, a GPU instance such as the p3.2xlarge is recommended for intensive machine learning tasks.
    - p3.2xlarge Instance: Approx. USD3.06 per hour.
    - p3.8xlarge Instance: Approx. USD12.24 per hour.
  - Reserved Instances and Spot Instances can offer significant cost savings.
- Amazon S3 (Simple Storage Service):
  - Used for storing datasets and model checkpoints.
  - Standard Storage: USD0.023 per GB per month.
  - Infrequent Access Storage: USD0.0125 per GB per month.
  - Glacier Storage (for archived models): USD0.004 per GB per month.
- Amazon EBS (Elastic Block Store):
  - Provides persistent block storage for use with EC2 instances.
  - General Purpose SSD (gp2): USD0.10 per GB per month.
  - Provisioned IOPS SSD (io1): Varies based on provisioned IOPS and storage size.
- Amazon VPC (Virtual Private Cloud):
  - Networking costs may be incurred for data transfer between services.
  - Data Transfer Out: First 1 GB per month is free, USD0.09 per GB for up to 10 TB per month.
- AWS Lambda:
  - For any serverless functions required in processing.
  - Lambda Functions: USD0.20 per 1 million requests, plus USD0.00001667 per GB-second of compute time.
- Amazon CloudWatch:
  - Monitoring and logging services for the infrastructure.
  - Custom Metrics: USD0.30 per metric per month.
  - Logs: USD0.50 per GB ingested, USD0.03 per GB archived.
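To see how these unit prices turn into a monthly bill, here is a minimal Python sketch. The unit prices are copied from the list above and will differ by region and over time. The `MonthlyUsage` class, the `monthly_cost` function and both usage profiles (a part-time UAT box and a 24/7 production setup) are hypothetical, with numbers picked purely for illustration.

```python
from dataclasses import dataclass

# Unit prices in USD, copied from the list above. Real AWS prices
# differ by region and change over time, so treat them as placeholders.
EBS_GP2_PER_GB_MONTH = 0.10
S3_STANDARD_PER_GB_MONTH = 0.023
DATA_OUT_PER_GB = 0.09
DATA_OUT_FREE_GB = 1.0
LAMBDA_PER_MILLION_REQUESTS = 0.20
LAMBDA_PER_GB_SECOND = 0.00001667
CLOUDWATCH_PER_METRIC = 0.30
CLOUDWATCH_PER_LOG_GB = 0.50


@dataclass
class MonthlyUsage:
    """Hypothetical usage profile for one month."""
    ec2_hourly_rate: float   # USD per instance hour
    ec2_hours: float         # hours the instance is running
    ebs_gb: float
    s3_gb: float
    data_out_gb: float
    lambda_requests: int
    lambda_gb_seconds: float
    cloudwatch_metrics: int
    log_gb_ingested: float


def monthly_cost(u: MonthlyUsage) -> dict:
    """Return a per-service cost breakdown plus a 'total' entry (USD)."""
    billable_out_gb = max(0.0, u.data_out_gb - DATA_OUT_FREE_GB)
    items = {
        "ec2": u.ec2_hourly_rate * u.ec2_hours,
        "ebs": u.ebs_gb * EBS_GP2_PER_GB_MONTH,
        "s3": u.s3_gb * S3_STANDARD_PER_GB_MONTH,
        "data_out": billable_out_gb * DATA_OUT_PER_GB,
        "lambda": (u.lambda_requests / 1_000_000) * LAMBDA_PER_MILLION_REQUESTS
                  + u.lambda_gb_seconds * LAMBDA_PER_GB_SECOND,
        "cloudwatch": u.cloudwatch_metrics * CLOUDWATCH_PER_METRIC
                      + u.log_gb_ingested * CLOUDWATCH_PER_LOG_GB,
    }
    items["total"] = sum(items.values())
    return items


# Assumed profiles: a p3.2xlarge used 8 hours x 22 working days for UAT,
# versus a p3.8xlarge running about 730 hours (a full month) in production.
UAT = MonthlyUsage(3.06, 176, 500, 200, 50, 100_000, 50_000, 10, 5)
PRODUCTION = MonthlyUsage(12.24, 730, 10_000, 2_000, 5_000,
                          50_000_000, 20_000_000, 100, 500)

for name, usage in (("UAT", UAT), ("Production", PRODUCTION)):
    print(f"{name}: USD{monthly_cost(usage)['total']:,.2f} per month")
```

Under these assumed profiles the production bill comes out many times larger than the UAT one, which is the budgeting gap described earlier.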
Working out the complete annual cost of hosting and running a self-owned LLAMA in AWS depends on several factors, including computing power, data storage, network transfer costs, and other ancillary services.
Compute: AWS provides various instances suitable for large language models, such as GPU-based EC2 instances. For example, a p3.8xlarge instance, which costs approximately USD12.24 per hour, running continuously (24 hours x 365 days = 8,760 hours) would cost around USD107,222 annually.
Storage: Amazon S3 or EBS provides flexible storage options. High-performance EBS might cost about USD0.10 per GB-month. Assuming a need of 10 TB (10,000 GB), storage costs might come to around USD12,000 annually.
Network Transfer: Data transfer costs vary, but with significant data going in and out, an estimated monthly charge of USD500 would come to USD6,000 annually.
Additional Services: Using AWS Lambda, API Gateway, or other services can add another USD5,000 a year in auxiliary costs.
Here is a rough estimate. The total annual costs should be around:
- Compute: USD107,222
- Storage: USD12,000
- Network Transfer: USD6,000
- Ancillary Services: USD5,000
Total Estimate: Approximately USD130,222 annually.
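The arithmetic behind this kind of estimate can be sketched in a few lines of Python. The function name and its parameters are made up for illustration. The defaults simply restate the assumptions in the text: USD12.24 per hour, 10 TB at USD0.10 per GB-month, USD500 per month for network transfer and USD5,000 for ancillary services. The `compute_discount` knob is a hypothetical stand-in for Reserved or Spot savings and is not a real AWS rate. The sketch assumes 24 x 365 = 8,760 hours per year, so its output can differ slightly from rounded figures that assume a different number of hours.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours in a non-leap year


def annual_estimate(hourly_rate: float = 12.24,
                    storage_gb: float = 10_000,
                    storage_per_gb_month: float = 0.10,
                    network_per_month: float = 500.0,
                    ancillary_per_year: float = 5_000.0,
                    compute_discount: float = 0.0) -> dict:
    """Return annual cost items in USD plus a 'total' entry.

    compute_discount is a fraction between 0 and 1 applied to the
    on-demand compute cost (an assumption, not an AWS price).
    """
    items = {
        "compute": hourly_rate * HOURS_PER_YEAR * (1.0 - compute_discount),
        "storage": storage_gb * storage_per_gb_month * 12,
        "network": network_per_month * 12,
        "ancillary": ancillary_per_year,
    }
    items["total"] = sum(items.values())
    return items


for item, cost in annual_estimate().items():
    print(f"{item:>10}: USD{cost:,.0f}")
```

Calling `annual_estimate(compute_discount=0.3)`, for instance, shows how much the total moves if compute turns out 30% cheaper, since compute dominates the bill.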
Please note that the above covers only the cloud service costs charged by AWS. Labour costs are not included and can vary a lot depending on the requirements of each customer.
LAIDFU, a configurable enterprise AI agent built on a no-code approach, lets users employ different AI service providers in their applications, ranging from OpenAI and Baidu to self-owned DeepSeek or LLAMA. Users are free to pick the most appropriate LLM to run user-defined use cases within various business processes.
About EDG Grant:
The Enterprise Development Grant (EDG) was launched in 2018. EDG is a single grant that supports companies in upgrading business capabilities, innovation and internationalization. In the same year, EDG replaced the Capability Development Grant (CDG) and the Global Company Partnership (GCP).
M18 ERP and M18 HCM are within the scope of the EDG.
Our consultants have rich experience in helping customers get their EDG applications approved to deploy our renowned M18 ERP in Singapore. Multiable will provide the relevant technical and system-related documentation and guide you throughout the application process.