Llama 3.1：开源语言模型的新里程碑_业界新闻

发布时间:2024-07-29 08:43

阅读量:0

大家好，最近AI领域的新闻接踵而至，让人目不暇接。就在7月（23号）晚上，Meta发布了Llama 3.1系列模型，包括8B、70B参数的版本，以及此前提到过的400B参数模型（实际上是405B）。根据Meta官方公布的数据，Llama 3.1 405B参数模型在某些测试数据集上甚至略微胜过了GPT-4。今天，让我们一起来了解Llama 3.1的更新内容，以及如何免费使用这个令人期待的模型。

https://ai.meta.com/blog/meta-llama-3-1/https://ai.meta.com/blog/meta-llama-3-1/我测试了它写代码能力，让它用python写个贪吃蛇游戏，写代码速度很慢，但是一次成功，下面是它写的代码和，代码运行的效果。

import pygame import sys import random  # Initialize Pygame pygame.init()  # Set up some constants WIDTH, HEIGHT = 800, 600 BLOCK_SIZE = 20 SPEED = 10  # Set up some colors BLACK = (0, 0, 0) WHITE = (255, 255, 255) RED = (255, 0, 0) GREEN = (0, 255, 0)  # Set up the display screen = pygame.display.set_mode((WIDTH, HEIGHT)) pygame.display.set_caption("Snake")  # Set up the font font = pygame.font.Font(None, 36)  # Set up the snake and food snake = [(200, 200), (220, 200), (240, 200)] food = (400, 300)  # Set up the direction direction = "RIGHT"  # Game loop while True:     # Event handling     for event in pygame.event.get():         if event.type == pygame.QUIT:             pygame.quit()             sys.exit()         elif event.type == pygame.KEYDOWN:             if event.key == pygame.K_UP and direction!= "DOWN":                 direction = "UP"             elif event.key == pygame.K_DOWN and direction!= "UP":                 direction = "DOWN"             elif event.key == pygame.K_LEFT and direction!= "RIGHT":                 direction = "LEFT"             elif event.key == pygame.K_RIGHT and direction!= "LEFT":                 direction = "RIGHT"      # Move the snake     head = snake[-1]     if direction == "UP":         new_head = (head[0], head[1] - BLOCK_SIZE)     elif direction == "DOWN":         new_head = (head[0], head[1] + BLOCK_SIZE)     elif direction == "LEFT":         new_head = (head[0] - BLOCK_SIZE, head[1])     elif direction == "RIGHT":         new_head = (head[0] + BLOCK_SIZE, head[1])     snake.append(new_head)      # Check for collision with food     if snake[-1] == food:         food = (random.randint(0, WIDTH - BLOCK_SIZE) // BLOCK_SIZE * BLOCK_SIZE,                 random.randint(0, HEIGHT - BLOCK_SIZE) // BLOCK_SIZE * BLOCK_SIZE)     else:         snake.pop(0)      # Check for collision with self or wall     if (snake[-1] in snake[:-1] or             snake[-1][0] < 0 or snake[-1][0] >= WIDTH or             snake[-1][1] < 0 or snake[-1][1] >= HEIGHT):         pygame.quit()         sys.exit()      # Draw everything     screen.fill(BLACK)     for pos in snake:         pygame.draw.rect(screen, GREEN, (pos[0], pos[1], BLOCK_SIZE, BLOCK_SIZE))     pygame.draw.rect(screen, RED, (food[0], food[1], BLOCK_SIZE, BLOCK_SIZE))     text = font.render(f"Score: {len(snake)}", True, WHITE)     screen.blit(text, (10, 10))     pygame.display.flip()      # Cap the frame rate     pygame.time.delay(1000 // SPEED)

运行效果如下：

Llama 3.1的主要更新

Context Window扩展：Llama 3.1将模型的Context Window从原来的8K tokens扩展到了128K tokens，极大地提升了处理长文和长对话的能力。
模型架构：Llama 3.1沿用了Llama 3的基础架构，即使是405B参数的模型也采用标准的decoder-only transformer架构，而非混合专家模型。同时继续使用GQA（分组查询注意力）技术，提高了长文处理能力。
性能表现：
- Llama 3.1 405B模型在多个测试数据集上超越了GPT-4o和Claude 3.5 Sonnet等顶级商业闭源模型。
- 8B参数版本优于参数相近的Gemma 2 9B IT和Mistral 7B Instruct。
- 70B参数版本不仅胜过开源模型Mixtral 8x22B，还在多项测试中大幅领先GPT-3.5 Turbo。
许可证更新：Meta更新了Llama 3.1的许可条款，允许使用模型输出来改进其他语言模型，但要求训练出的新模型名称必须以"Llama"开头，并标注"Built with Llama"。
指令微调：Llama 3.1的Instruct版本根据工具调用进行了微调，并引入了新的iPython角色来接收和记录工具调用返回的数据。