LLM-PySC2 Project Introduction
Created on: January 5, 2025
Please read and understand the project below.
<p align="center"> <img src="docs/figures/llm_smac_demo1.gif" width="225"> <img src="docs/figures/llm_smac_demo3.gif" width="225"> <img src="docs/figures/llm_smac_demo2.gif" width="225"><br/> <img src="docs/figures/llm_pysc2_demo1.gif" width="225"> <img src="docs/figures/llm_pysc2_demo2.gif" width="225"> <img src="docs/figures/llm_pysc2_demo3.gif" width="225"><br/> <img src="docs/figures/llm_pysc2_demo4.gif" width="225"> <img src="docs/figures/llm_pysc2_demo5.gif" width="225"><br/> <i>Demos of LLM-Pysc2 with SMAC tasks</i> </p>

LLM-PySC2 is the Python component of the StarCraft II LLM Decision Environment, developed by the
NKU Robot Autonomy and Human-AI Collaboration Group and the
NUDT Laboratory for Big Data and Decision.
It wraps DeepMind's PySC2 Learning Environment API
into an LLM-powered multi-agent decision environment.
This is a collaboration between NKU and NUDT to develop StarCraft II into a rich environment for LLM research.
LLM-PySC2 provides an interface for LLM agents to interact with StarCraft II,
receiving textual or multimodal observations and issuing textual actions.
We also make it possible for LLMs to make decisions in SMAC tasks, which allows LLM methods to be compared directly with RL methods.
If you use the LLM-PySC2 environment or LLM-SMAC tasks in your research,
please cite our GitHub page or, once it is published, the
LLM StarCraft II preprint paper.
You can also contact us by e-mail at [email protected] or [email protected].
arXiv paper: https://arxiv.org/abs/2411.05348 (Citation: https://ui.adsabs.harvard.edu/abs/2024arXiv241105348L/exportcitation)
LLM-PySC2 depends on the full StarCraft II game and only works with versions that
include the API, which is 3.16.1 and above.
Follow Blizzard's documentation to
get the Linux version. By default, LLM-PySC2 expects the game to live in
~/StarCraftII/. You can override this path by setting the SC2PATH
environment variable or creating your own run_config.
Install the game as normal from Battle.net. Even the
Starter Edition will work.
If you used the default install location LLM-PySC2 should find the latest binary.
If you changed the install location, you might need to set the SC2PATH
environment variable with the correct location.
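The lookup order described above (the SC2PATH environment variable first, then the default location) can be sketched as follows. This is only an illustration under stated assumptions, not PySC2's actual run_config code: `resolve_sc2_path` is a hypothetical helper, and the default shown is the Linux path named in the text.

```python
import os
from pathlib import Path

# Default Linux location named in the text; other platforms have their own defaults.
DEFAULT_SC2_PATH = "~/StarCraftII/"


def resolve_sc2_path(environ=None) -> Path:
    """Return the StarCraft II directory, preferring SC2PATH when it is set.

    Hypothetical helper for illustration; PySC2 performs its own lookup.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("SC2PATH") or DEFAULT_SC2_PATH
    return Path(raw).expanduser()


if __name__ == "__main__":
    print(resolve_sc2_path())
```

Printing the resolved path before launching an experiment is a quick way to check which installation will be used.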
LLM-PySC2 should work on MacOS and Windows systems running Python 3.8+,
but has only been thoroughly tested on Linux. We welcome suggestions and patches
for better compatibility with other systems.
Download the LLM-PySC2 code from our GitHub page LLM-PySC2, then
use pip install to initialize the environment:

    $ conda create --name YOUR_ENV_NAME python==3.9
    $ conda activate YOUR_ENV_NAME
    $ pip install -e .

You can use a mirror, for example pip install -e . -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple,
to speed up downloading.
We have placed the required maps in the project folder:

    llm_pysc2/maps/llm_pysc2
    llm_pysc2/maps/llm_smac

You need to copy these 2 folders
into the Maps folder of the StarCraft II installation. Generally, the folder path is:

    C:\Program Files (x86)\StarCraft II\Maps

and the result should look like:

    C:\Program Files (x86)\StarCraft II\Maps\llm_pysc2
    C:\Program Files (x86)\StarCraft II\Maps\llm_smac

If you used a custom installation path, you will need to locate the Maps folder there to finish this step.
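The copy step can also be scripted. The sketch below assumes the two source folders sit under llm_pysc2/maps as listed above; `install_maps` is a hypothetical helper and is not part of LLM-PySC2.

```python
import shutil
from pathlib import Path

# The two map folders shipped with the project, as listed in the text.
MAP_FOLDERS = ("llm_pysc2", "llm_smac")


def install_maps(project_maps_dir, sc2_maps_dir):
    """Copy both map folders into the StarCraft II Maps directory.

    Hypothetical helper; returns the destination folders that were written.
    """
    src_root = Path(project_maps_dir)
    dst_root = Path(sc2_maps_dir)
    dst_root.mkdir(parents=True, exist_ok=True)
    copied = []
    for folder in MAP_FOLDERS:
        dst = dst_root / folder
        # dirs_exist_ok (Python 3.8+) allows the copy to be re-run safely.
        shutil.copytree(src_root / folder, dst, dirs_exist_ok=True)
        copied.append(dst)
    return copied
```

On a default Windows installation this would be called as `install_maps("llm_pysc2/maps", r"C:\Program Files (x86)\StarCraft II\Maps")`, which may require administrator rights.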
If you do not know how to get an api_key, you can contact us to obtain a temporary gpt-3.5-turbo api_key with 2M tokens for free.
You need to write your api_key in ./llm_pysc2/agents/configs/config.ProtossAgentConfig
before testing the LLM:

    class ProtossAgentConfig(AgentConfig):
        def __init__(self):
            super(ProtossAgentConfig, self).__init__()
            self.race = 'protoss'
            self.model_name = 'gpt-3.5-turbo'
            self.api_base = 'YOUR_API_BASE'
            self.api_key = 'YOUR_API_KEY'
            ...

or set the api_key as we do in ./llm_pysc2/bin/experiment_llm_pysc2.py:

    config.reset_llm(model_name, api_base, api_key)

If you do not have an api_key but still want to test the environment,
you can set config.LLM_SIMULATION_TIME = 5
to simulate a large model with a 5-second response time
and continue with the tutorial below.
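The fallback to simulation mode can be summarised as a small pure function. The sketch below mirrors the checks in the auto_check method of config.py shown later in this document; `effective_simulation_time` is a hypothetical name, and the placeholder strings are the hyphenated ones that config.py compares against.

```python
def effective_simulation_time(model_name, api_key, simulation_time=0):
    """Return the simulated response time (seconds) that would be in effect.

    Hypothetical helper mirroring AgentConfig.auto_check: a positive
    simulation_time is kept as-is; otherwise a missing model name or
    api_key forces a 5-second simulated model.
    """
    if isinstance(simulation_time, (int, float)) and simulation_time > 0:
        return simulation_time  # simulation was requested explicitly
    missing_model = model_name in ("", "YOUR-MODEL-NAME")
    missing_key = api_key in ("", "YOUR-API-KEY")
    if missing_model or missing_key:
        return 5  # same fallback value that auto_check assigns
    return 0  # a real LLM will be queried


if __name__ == "__main__":
    print(effective_simulation_time("gpt-3.5-turbo", "YOUR-API-KEY"))
```

A return value of 0 means the agents will query the configured model; any positive value means responses are simulated.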
After specifying your LLM api_key, api_base and model_name, you can run our experiments to test the LLM on
both the llm_pysc2 tasks and the llm_smac tasks:

    $ python -m llm_pysc2.bin.experiment_llm_pysc2
    $ python -m llm_pysc2.bin.experiment_llm_smac

These two scripts load gpt-3.5 powered agents and use pure text observations to make decisions.
If you want to use multimodal LLMs such as gpt-4v, you can set config.ENABLE_IMAGE_RGB = True
to activate image observations.
You can also use the --parallel parameter (or edit the files in ./llm_pysc2/bin) to run several games at the same time:

    $ python -m pysc2.bin.agent --map pvz_task4_level1 --agent_race protoss --parallel 2 --agent llm_pysc2.bin.experiment_llm_pysc2.MainAgentLLMPysc2
    $ python -m pysc2.bin.agent --map pvz_task4_level1 --agent_race protoss --parallel 4 --agent llm_pysc2.bin.experiment_llm_pysc2.MainAgentLLMPysc2

which may significantly improve experimental efficiency.
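When sweeping over maps or parallelism levels, it can help to build these commands programmatically. The sketch below only assembles the argument list from the flags shown above; `build_agent_command` is a hypothetical helper, and launching the result (for example with `subprocess.run`) is left to the caller.

```python
import sys

# Agent class path taken from the commands in the text.
DEFAULT_AGENT = "llm_pysc2.bin.experiment_llm_pysc2.MainAgentLLMPysc2"


def build_agent_command(map_name, parallel=1, agent=DEFAULT_AGENT, agent_race="protoss"):
    """Return the argv list for one pysc2.bin.agent run (hypothetical helper)."""
    if parallel < 1:
        raise ValueError("parallel must be at least 1")
    return [
        sys.executable, "-m", "pysc2.bin.agent",
        "--map", map_name,
        "--agent_race", agent_race,
        "--parallel", str(parallel),
        "--agent", agent,
    ]


if __name__ == "__main__":
    print(" ".join(build_agent_command("pvz_task4_level1", parallel=2)))
```

Returning a list rather than a single string avoids shell-quoting problems when the command is passed to `subprocess.run`.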
We provide two series of experiment tasks:

    (1) llm_pysc2 experiments: a series of PvZ combat tasks, with 8 different settings and 3 levels of difficulty for each.
    (2) llm_smac experiments: the same as the original SMAC tasks, with units controlled by an LLM agent instead of an RL agent.

You can run these experiments in ./llm_pysc2/bin/llm_pysc2
and ./llm_pysc2/bin/llm_smac.
The llm_smac experiments are the original SMAC tasks and use the same maps and settings. Because
control of Zerg and Terran units is still under development, we will add more SMAC tasks in the future
(executable files are in ./llm_pysc2/bin/llm_smac):

    2s3z
    3s5z
    1c3s5z
    3s_vs_3z
    3s_vs_4z
    3s_vs_5z
    2c_vs_64zg
    2s_vs_1sc
    3s5z_vs_3s6z
    (more to be added in future versions)

The llm_pysc2 experiments contain 8 PvZ combat tasks with 3 levels of difficulty each. They focus on larger
scenarios and the use of unit skills, and offer higher complexity and more room for unit control:
Task | Description | Text-Input | Image-Input | Multi-Agent | Communication |
---|---|---|---|---|---|
task1 | 2 Adept harass zerg base | ✔ | ✔ (can be disabled) | ✖ | ✖ |
task2 | 3 Phoenix harass zerg base | ✔ | ✔ (can be disabled) | ✖ | ✖ |
task3 | 6 Stalker intercept airdrops | ✔ | ✔ (can be disabled) | ✖ | ✖ |
task4 | 12 Stalkers vs many zerg units | ✔ | ✔ (can be disabled) | ✖ | ✖ |
task5 | 2c3d4s12s vs many zerg units | ✔ | ✔ (can be disabled) | ✔ (Isomorphic) | ✖ (can be enabled) |
task6 | 1a6h4s12s vs many zerg units | ✔ | ✔ (can be disabled) | ✔ (Isomorphic) | ✖ (can be enabled) |
task7 | 1m3c3t6v12s vs many zerg units | ✔ | ✔ (can be disabled) | ✔ (Heterogeneous) | ✔ (can be disabled) |
task8 | 2wp8bg12s multiline attack | ✔ | ✔ (can be disabled) | ✔ (Heterogeneous) | ✔ (can be disabled) |
Each task has three difficulty levels. Level 1 is intended for demos and method debugging;
level 2 is the standard difficulty and can be used for policy training; level 3 is quite difficult and is intended
for evaluating trained methods, where it can serve as a good out-of-distribution (OOD) evaluation environment.
More details can be found in ./docs/llm_pysc2/experiments.md.
If you want to change how a SubAgent interacts with the large model, you can redefine the query function of
the SubAgent. See the code of class Customized_LLMAgent(LLMAgent)
in ./llm_pysc2/agents/llm_pysc2_agent.py:

    class Customized_LLMAgent(LLMAgent):
        def query(self, obs) -> None:
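As a rough illustration of the override pattern, the sketch below subclasses a stand-in base class. Everything here is an assumption made for the example: the real LLMAgent in llm_pysc2 carries far more state, and the `last_response` attribute and the `EchoAgent` class are hypothetical names, not part of the project.

```python
class LLMAgent:
    """Stand-in for llm_pysc2's LLMAgent (assumed minimal shape for this sketch)."""

    def __init__(self, name):
        self.name = name
        self.last_response = None  # hypothetical attribute, not the real API

    def query(self, obs) -> None:
        raise NotImplementedError


class EchoAgent(LLMAgent):
    """Overrides query() with a local, deterministic rule instead of an LLM call."""

    def query(self, obs) -> None:
        # A real override would send the observation to an LLM client here.
        # This sketch records a canned reply so the flow can be tested offline.
        self.last_response = f"{self.name} received {len(str(obs))} characters"


if __name__ == "__main__":
    agent = EchoAgent("CombatGroupSmac")
    agent.query("text observation")
    print(agent.last_response)
```

Because query returns None in the signature shown above, the sketch stores its result on the instance rather than returning it.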
The main agent is used to interact with pysc2 and does not make decisions directly, so modifying it is not recommended.
You can influence the main agent by modifying the config, as we do in:

    llm_pysc2/bin/experiment_llm_pysc2.py
    llm_pysc2/bin/experiment_llm_smac.py

The MainAgent is only an object used for scheduling cameras, managing internal data and interacting with the environment,
while the SubAgents of the MainAgent query the LLM to obtain text actions and play the role of decision makers.
SubAgents are defined in the configuration files (llm_pysc2/agents/configs).
For example, ./llm_pysc2/agents/configs/llm_smac/config_2s3z.py
defines a MainAgent with only one sub-agent, named
CombatGroupSmac:

    class ConfigSmac_2s3z(ProtossAgentConfig):
        def __init__(self):
            super(ConfigSmac_2s3z, self).__init__()
            ...
            self.AGENTS = {
                'CombatGroupSmac': {
                    'describe': "Protoss military commander, controls units to fight against enemy. ",
                    'llm': {...},
                    'team': [
                        {'name': 'Zealot-1', 'unit_type': [units.Protoss.Zealot], 'game_group': 1, 'select_type': 'group'},
                        {'name': 'Zealot-2', 'unit_type': [units.Protoss.Zealot], 'game_group': 2, 'select_type': 'group'},
                        {'name': 'Stalker-1', 'unit_type': [units.Protoss.Stalker], 'game_group': 4, 'select_type': 'group'},
                    ],
                    'action': {
                        units.Protoss.Zealot: PROTOSS_BASIC_ACTION_SMAC,
                        units.Protoss.Stalker: PROTOSS_BASIC_ACTION_SMAC,
                    },
                },
            }
            ...

You can add more sub-agents to your configuration if needed.
You can design each agent's unit teams in the config.
The example above shows a sub-agent with 3 teams, called 'Zealot-1', 'Zealot-2' and 'Stalker-1'.
You can redefine your agent's teams if needed.
A team should consist of:

    'name': any str
    'unit_type': list of pysc2 unit enums
    'game_group': int, -1 to 9; -1 means the units are not added to any game group
    'select_type': str; 'group' for group recall, 'select' for single select (mouse left click), 'select_all_type' for selecting all on-screen units of the same type (mouse double click)

More examples can be found in ./llm_pysc2/agents/configs.
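The field rules above can be checked mechanically before a config is used. The following sketch is not part of LLM-PySC2: `validate_team` is a hypothetical helper, and the example uses plain strings in place of pysc2 unit enums so that it runs without pysc2 installed.

```python
VALID_SELECT_TYPES = {"group", "select", "select_all_type"}


def validate_team(team):
    """Return a list of problems found in one team entry (empty if it is valid)."""
    problems = []
    if not isinstance(team.get("name"), str):
        problems.append("'name' must be a str")
    if not isinstance(team.get("unit_type"), list):
        problems.append("'unit_type' must be a list of pysc2 unit enums")
    group = team.get("game_group")
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if not isinstance(group, int) or isinstance(group, bool) or not -1 <= group <= 9:
        problems.append("'game_group' must be an int from -1 to 9")
    if team.get("select_type") not in VALID_SELECT_TYPES:
        problems.append("'select_type' must be 'group', 'select' or 'select_all_type'")
    return problems


if __name__ == "__main__":
    # Strings stand in for units.Protoss.Zealot in this illustration.
    team = {"name": "Zealot-1", "unit_type": ["Zealot"], "game_group": 1, "select_type": "group"}
    print(validate_team(team))
```

Returning a list of messages instead of raising on the first error makes it easier to report every problem in a large config at once.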
You can design each team's action space in the config.
The example above (ConfigSmac_2s3z) shows an action space for CombatGroupSmac,
where PROTOSS_BASIC_ACTION_SMAC
has the form:

    [{'name': 'Attack_Unit', 'arg': ['tag'], 'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},]

and PROTOSS_BASIC_ACTION_SMAC2
has the form:

    [{'name': 'Attack_Unit', 'arg': ['tag'], 'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
     {'name': 'Move_Screen', 'arg': ['screen'], 'func': [(331, F.Move_screen, ('queued', 'screen'))]},]

Each action should consist of three parts:

    'name': any str
    'arg': list, one of [] / ['tag'] / ['screen'] / ['minimap'] / ['tag', 'screen'] / ['tag', 'minimap']
    'func': a list of triplets (pysc2_func_id, pysc2_func, args_type)

More examples can be found in ./llm_pysc2/agents/configs
and ./llm_pysc2/lib/llm_actions.
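The three-part structure can be verified with a short check. This is a sketch under stated assumptions rather than project code: `validate_action` is a hypothetical helper, it accepts only the argument lists enumerated above, and the example passes a string where a real config would pass a pysc2 function such as F.Attack_screen.

```python
# Argument lists permitted by the description in the text.
ALLOWED_ARGS = [[], ["tag"], ["screen"], ["minimap"], ["tag", "screen"], ["tag", "minimap"]]


def validate_action(action):
    """Return a list of problems found in one action entry (empty if it is valid)."""
    problems = []
    if not isinstance(action.get("name"), str):
        problems.append("'name' must be a str")
    if action.get("arg") not in ALLOWED_ARGS:
        problems.append("'arg' is not one of the allowed argument lists")
    funcs = action.get("func")
    if not isinstance(funcs, list) or not funcs:
        problems.append("'func' must be a non-empty list of triplets")
    else:
        for i, entry in enumerate(funcs):
            # Each entry is (pysc2_func_id, pysc2_func, args_type).
            if not (isinstance(entry, tuple) and len(entry) == 3 and isinstance(entry[0], int)):
                problems.append(f"'func'[{i}] must be a triplet starting with an int id")
    return problems


if __name__ == "__main__":
    attack = {"name": "Attack_Unit", "arg": ["tag"],
              "func": [(12, "F.Attack_screen", ("queued", "screen_tag"))]}
    print(validate_action(attack))
```

Running such a check over every entry of an action list catches malformed triplets before the game starts.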
You can design your own Observation Wrapper by redefining o_translator
in ./llm_pysc2/lib/llm_observations.py.
You can design your own Text Action Recognizer by redefining a_translator
in ./llm_pysc2/lib/llm_actions.py.
You can design your own LLM Client in llm_pysc2/lib/llm_client
if needed.
We plan to do our best to add the following features before 2025/2/1:

    (1) Support for Zerg control
    (2) Support for Terran control
    (3) Compatibility with classic RL algorithms

And more features before 2025/5/1:

    (1) Full game experiments series 1: on maps Simple64, Simple96, Simple128
    (2) Full game experiments series 2: on maps Ancient Cistern LE, Babylon LE, Gresvan LE

Note that this release of LLM-PySC2 is a preview version; the official version, with far more features and better stability,
will be open-sourced before 2025/7/1.
Zongyuan Li (Main Contributor, Nankai University):
Framework design. Multi-Agent structure. Text observation generation. Text action recognition.
Communication. Experiments design. Prompt. Logger. Documents. Organize.
Runnan Qi, Yanan Ni, Lumin Jiang (National University of Defense Technology):
LLM Client. Multimodal LLM Client. Image observation generation. Documents.
Chang Lu, Xiaojie Xu, Pengfei Li, Yunzheng Guo, Zhe Ma (Nankai University):
Data recorder. Experiments evaluation. Game knowledge.
Kuihua Huang (National University of Defense Technology), Xian Guo, Xuebo Zhang(Nankai University):
Organize.
https://github.com/oxwhirl/smac
https://github.com/google-deepmind/pysc2
https://github.com/histmeisah/Large-Language-Models-play-StarCraftII
Contents of config.py
from llm_pysc2.lib.llm_action import (
    PROTOSS_ACTION_BUILD,
    PROTOSS_BASIC_ACTION_1, PROTOSS_BASIC_ACTION_2, PROTOSS_BASIC_ACTION_3,
    PROTOSS_ACTION_WARPTRAIN, PROTOSS_ACTION_TRAIN, PROTOSS_ACTION_RESEARCH, F,
)
from llm_pysc2.lib.llm_client import vision_model_names #, video_model_names
from pysc2.lib import units
from loguru import logger
import time
def wait(second, log_id, more_info=''):
    # Count down `second` seconds, logging a warning once per second.
    for i in range(second):
        logger.warning(f"[ID {log_id}] Experiment will start with UNSAFE settings in {second - i} seconds. {more_info}")
        time.sleep(1)
class AgentConfig:
    def __init__(self):
        self.race = 'protoss'
        self.model_name = 'gpt-3.5-turbo'  # 'gpt-3.5-turbo'
        self.api_base = 'https://hk.xty.app/v1'  # 'https://hk.xty.app/v1'
        self.api_key = 'YOUR-API-KEY'  # 'xxxxxxxxxxxxxxxxxxxxxxxx....'
        self.temperature = 0.1
        self.basic_prompt = 'default'
        self.translator_o = 'default'
        self.translator_a = 'default'
        self.communicator = 'default'
        self.ENABLE_INIT_STEPS = True
        self.ENABLE_AUTO_WORKER_MANAGE = True
        self.ENABLE_AUTO_WORKER_TRAINING = True
        self.ENABLE_COMMUNICATION = False
        self.ENABLE_IMAGE_RGB = False
        self.ENABLE_IMAGE_FEATURE = False
        self.ENABLE_SAVE_IMAGES = True
        self.LLM_SIMULATION_TIME = 0
        self.MAX_LLM_QUERY_TIMES = 5
        self.MAX_LLM_WAITING_TIME = 15
        self.MAX_LLM_RUNTIME_ERROR_TIME = 45
        self.MAX_LLM_DECISION_FREQUENCY = 1
        self.MAX_NUM_ACTIONS = 3
        self.AGENTS = {}  # dict of sub-agent configs (reset_llm iterates over its keys); filled in by subclasses
        self.AGENTS_ALWAYS_DISABLE = []

    def reset_llm(self, model_name=None, api_base=None, api_key=None, ENABLE_IMAGE_RGB=None, ENABLE_IMAGE_FEATURE=None):
        if model_name is not None and model_name != 'YOUR-MODEL-NAME':
            self.model_name = model_name
        if api_base is not None and api_base != 'YOUR-API-BASE':
            self.api_base = api_base
        if api_key is not None and api_key != 'YOUR-API-KEY':
            self.api_key = api_key
        if ENABLE_IMAGE_RGB is not None:
            self.ENABLE_IMAGE_RGB = ENABLE_IMAGE_RGB
        if ENABLE_IMAGE_FEATURE is not None:
            self.ENABLE_IMAGE_FEATURE = ENABLE_IMAGE_FEATURE
        if ENABLE_IMAGE_RGB is True and ENABLE_IMAGE_FEATURE is True:
            raise AssertionError("Do not support ENABLE_IMAGE_RGB and ENABLE_IMAGE_FEATURE at the same time, currently")
        for agent_name in self.AGENTS.keys():
            self.AGENTS[agent_name]['llm']['model_name'] = self.model_name
            self.AGENTS[agent_name]['llm']['api_base'] = self.api_base
            self.AGENTS[agent_name]['llm']['api_key'] = self.api_key
            if self.ENABLE_IMAGE_RGB:
                self.AGENTS[agent_name]['llm']['img_rgb'] = True
                self.AGENTS[agent_name]['llm']['img_fea'] = False
            elif self.ENABLE_IMAGE_FEATURE:
                self.AGENTS[agent_name]['llm']['img_rgb'] = False
                self.AGENTS[agent_name]['llm']['img_fea'] = True
            else:
                self.AGENTS[agent_name]['llm']['img_rgb'] = False
                self.AGENTS[agent_name]['llm']['img_fea'] = False

    def auto_check(self, log_id):
        # Fall back to LLM simulation mode when no usable model name or api_key is configured.
        if not isinstance(self.LLM_SIMULATION_TIME, (int, float)) or self.LLM_SIMULATION_TIME <= 0:
            error_in_llm_setting = False
            if self.model_name == '' or self.model_name == 'YOUR-MODEL-NAME':
                self.reset_llm(model_name='gpt-3.5-turbo')
                logger.error(f"[ID {log_id}] No model_name set, please specify model_name in the config.")
                self.LLM_SIMULATION_TIME = 5
                error_in_llm_setting = True
            if self.api_key == '' or self.api_key == 'YOUR-API-KEY':
                logger.error(f"[ID {log_id}] No api_key set, please specify your api_key in the config.")
                self.LLM_SIMULATION_TIME = 5
                error_in_llm_setting = True
            if self.model_name == '' or self.api_key == '':
                self.LLM_SIMULATION_TIME = 5
                error_in_llm_setting = True
            if error_in_llm_setting:
                wait(5, log_id, "(in LLM SIMULATION MODE)")

        # Check that the image settings are consistent with the chosen model.
        if self.ENABLE_IMAGE_RGB or self.ENABLE_IMAGE_FEATURE:
            if self.ENABLE_IMAGE_RGB and self.ENABLE_IMAGE_FEATURE:
                logger.error(f"[ID {log_id}] can not enable config.ENABLE_IMAGE_RGB and config.ENABLE_IMAGE_FEATURE together.")
                raise AssertionError(f"config.ENABLE_IMAGE_RGB and config.ENABLE_IMAGE_FEATURE can not be True together")
            if self.model_name not in vision_model_names:
                logger.error(f"[ID {log_id}] config.ENABLE_IMAGE_RGB/FEATURE with large models that do not support images.")
                wait(5, log_id)
            if self.model_name in vision_model_names:
                logger.warning(f"[ID {log_id}] You are using a vision model with image obs, this may cost a lot, be cautious.")
                wait(5, log_id)
        else:
            if self.model_name in vision_model_names:
                logger.warning(f"[ID {log_id}] You are using a vision-capable model without using any image obs.")
                wait(5, log_id)

class ProtossAgentConfig(AgentConfig):
    def __init__(self):
        super(ProtossAgentConfig, self).__init__()
text# Program control parameters in class AgentConfig (above) self.AGENTS_ALWAYS_DISABLE = [] self.AGENTS = { 'Airborne': { 'describe': "Protoss airborne commander, controls units airborne/warptrain from WarpPrism. " "Responsible for quick reinforcing nearby units or executing multiline combat.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Airborne-Zealot-1', 'unit_type': [units.Protoss.Zealot], 'game_group': -1, 'select_type': 'select_all_type'}, # , 'max_unit_num': {units.Protoss.Zealot: -1} ], 'action': { units.Protoss.Zealot: PROTOSS_BASIC_ACTION_2, }, }, 'Builder': { 'describe': "Protoss builder, controls several Probe. Responsible for build buildings", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Builder-Probe-1', 'unit_type': [units.Protoss.Probe], 'game_group': -1, 'select_type': 'select'}, # , 'max_unit_num': {units.Protoss.Probe: -1} ], 'action': { units.Protoss.Probe: PROTOSS_BASIC_ACTION_2 + PROTOSS_ACTION_BUILD, }, }, 'Commander': { 'describe': "Protoss military supreme commander. " "Responsible for making macro decision through communication, and controls nexus for massrecall " "for tactical objectives. 
When make deployment, describe the time, location, and objectives of the " "mission as clearly as possible", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': 'commander', 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Empty', 'unit_type': [], 'game_group': -1, 'select_type': 'select'}, ], 'action': { 'EmptyGroup': [], }, }, 'Developer': { 'describe': "Protoss logistics commander. " "Responsible for unit trainning, unit warp trainning, technology upgrade and order the Builder " "to build.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': 'developer', 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'WarpGate-1', 'unit_type': [units.Protoss.WarpGate], 'game_group': -1, 'select_type': 'select_all_type'}, # , 'max_unit_num': {units.Protoss.WarpGate: -1} {'name': 'Empty', 'unit_type': [], 'game_group': -1, 'select_type': 'select'}, ], 'action': { units.Protoss.WarpGate: PROTOSS_ACTION_WARPTRAIN, 'EmptyGroup': PROTOSS_BASIC_ACTION_1 + PROTOSS_ACTION_RESEARCH + PROTOSS_ACTION_TRAIN + [ {'name': 'Stop_Building_Unit', 'arg': ['tag'], 'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (454, F.Stop_Building_quick, ('queued'))]} ], }, }, 'Defender': { 'describe': "Protoss garrison troops commander, controls several Stalkers. 
" "Responsible for intercepting enemy infiltrating forces.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ # {'name': 'Nexus', 'unit_type': [units.Protoss.Nexus], # 'game_group': -1, 'select_type': 'select', 'max_unit_num': {units.Protoss.Nexus: 8}}, {'name': 'Stalker-1', 'unit_type': [units.Protoss.Stalker], 'game_group': 1, 'select_type': 'group'}, # , 'max_unit_num': {units.Protoss.Stalker: 8} ], 'action': { # units.Protoss.Nexus: [ # {'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]}, # {'name': 'Select_Workers_Attack_Screen', 'arg': ['screen'], # 'func': [(12, F.Attack_screen, ('queued', 'screen'))]}, # {'name': 'Select_Workers_Move_Screen', 'arg': ['screen'], # 'func': [(331, F.Move_screen, ('queued', 'screen'))]}, # ], units.Protoss.Stalker: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_Blink_Screen', 'arg': ['screen'], 'func': [(180, F.Effect_Blink_screen, ('queued', 'screen'))]}, {'name': 'Select_Unit_Blink_Screen', 'arg': ['tag', 'screen'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (180, F.Effect_Blink_screen, ('now', 'screen'))]}, ] }, }, 'CombatGroup0': { 'describe': "Protoss frontline commander, controls several Zealots. 
" "Responsible for providing cover for the main force and executing multi line combat.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Zealot-1', 'unit_type': [units.Protoss.Zealot], 'game_group': 2, 'select_type': 'group'}, # , 'max_unit_num': {units.Protoss.Zealot: -1}, -1 for unlimited {'name': 'Zealot-2', 'unit_type': [units.Protoss.Zealot], 'game_group': 3, 'select_type': 'group'}, # , 'max_unit_num': {units.Protoss.Zealot: -1}, -1 for unlimited ], 'action': { units.Protoss.Zealot: PROTOSS_BASIC_ACTION_2, }, }, 'CombatGroup1': { 'describe': "Protoss frontline commander, controls several Stalkers. " "Responsible for providing cover for the main force and restraining enemy forces.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Stalker-1', 'unit_type': [units.Protoss.Stalker], 'game_group': 4, 'select_type': 'group'}, {'name': 'Stalker-2', 'unit_type': [units.Protoss.Stalker], 'game_group': 5, 'select_type': 'group'}, {'name': 'Stalker-3', 'unit_type': [units.Protoss.Stalker], 'game_group': 6, 'select_type': 'group'}, ], 'action': { units.Protoss.Stalker: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_Blink_Screen', 'arg': ['screen'], 'func': [(180, F.Effect_Blink_screen, ('queued', 'screen'))]}, {'name': 'Select_Unit_Blink_Screen', 'arg': ['tag', 'screen'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (180, F.Effect_Blink_screen, ('now', 'screen'))]}, ] }, }, 'CombatGroup2': { 'describe': "Protoss frontline commander, controls ground main force such as 
Immortal, Colossus and Archon. " "Responsible for frontal combat.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Immortal-1', 'unit_type': [units.Protoss.Immortal], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'Immortal-2', 'unit_type': [units.Protoss.Immortal], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported {'name': 'Colossus-1', 'unit_type': [units.Protoss.Colossus], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'Colossus-2', 'unit_type': [units.Protoss.Colossus], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported {'name': 'Archon-1', 'unit_type': [units.Protoss.Archon], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'Archon-2', 'unit_type': [units.Protoss.Archon], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported ], 'action': { units.Protoss.Immortal: PROTOSS_BASIC_ACTION_2, units.Protoss.Colossus: PROTOSS_BASIC_ACTION_2, units.Protoss.Archon: PROTOSS_BASIC_ACTION_2, }, }, 'CombatGroup3': { 'describe': "Protoss frontline commander, controls air main force such as VoidRay, Carrier and Tempest. 
" "Responsible for frontal combat.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'VoidRay-1', 'unit_type': [units.Protoss.VoidRay], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'VoidRay-2', 'unit_type': [units.Protoss.VoidRay], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported {'name': 'Carrier-1', 'unit_type': [units.Protoss.Carrier], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'Carrier-2', 'unit_type': [units.Protoss.Carrier], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported {'name': 'Tempest-1', 'unit_type': [units.Protoss.Tempest], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'Tempest-2', 'unit_type': [units.Protoss.Tempest], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported ], 'action': { units.Protoss.Carrier: PROTOSS_BASIC_ACTION_2, units.Protoss.Tempest: PROTOSS_BASIC_ACTION_2, units.Protoss.VoidRay: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_PrismaticAlignment', 'arg': [], 'func': [(244, F.Effect_VoidRayPrismaticAlignment_quick, ('queued'))]}, ], }, }, 'CombatGroup4': { 'describe': "Protoss reconnaissance commander, controls Observer and several Probe. 
" "Responsible for providing reconnaissance infomation and detect cloak unit for main force", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Probe', 'unit_type': [units.Protoss.Probe], 'game_group': -1, 'select_type': 'select'}, {'name': 'Observer', 'unit_type': [units.Protoss.Observer, units.Protoss.ObserverSurveillanceMode], 'game_group': -1, 'select_type': 'select'}, ], 'action': { units.Protoss.Probe: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Lock_Nexus_Near', 'arg': ['tag'], 'func': [(70, F.Build_Pylon_screen, ('queued', 'screen_tag'))]}, {'name': 'Lock_Assimilator_Near', 'arg': ['tag'], 'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen_tag'))]}, ], units.Protoss.Observer: PROTOSS_BASIC_ACTION_3 + [ {'name': 'Morph_SurveillanceMode', 'arg': [], 'func': [(538, F.Morph_SurveillanceMode_quick, ('queued'))]}, ], units.Protoss.ObserverSurveillanceMode: [ {'name': 'Continuously_Monitor_Here', 'arg': [], 'func': [(0, F.no_op, ())]}, {'name': 'Morph_ObserverMode', 'arg': [], 'func': [(535, F.Morph_ObserverMode_quick, ('queued'))]}, ], }, }, 'CombatGroup5': { 'describe': "Protoss AOE commander, controls HighTemplar and Disruptor. 
" "Responsible for dealing high damage to clustered enemies", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'HighTemplar-1', 'unit_type': [units.Protoss.HighTemplar], 'game_group': 7, 'select_type': 'group'}, {'name': 'Disruptor-1', 'unit_type': [units.Protoss.Disruptor], 'game_group': 8, 'select_type': 'group'}, ], 'action': { units.Protoss.HighTemplar: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_PsiStorm_Screen', 'arg': ['screen'], 'func': [(218, F.Effect_PsiStorm_screen, ('queued', 'screen'))]}, {'name': 'Ability_PsiStorm_Attack_Unit', 'arg': ['tag'], 'func': [(218, F.Effect_PsiStorm_screen, ('queued', 'screen_tag'))]}, {'name': 'Morph_Archon', 'arg': [], 'func': [(296, F.Morph_Archon_quick, ('queued'))]}, {'name': 'Select_Two_Unit_Morph_Archon', 'arg': ['tag', 'tag'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (3, F.select_rect, ('add', 'screen1_tag2', 'screen2_tag2')), (296, F.Morph_Archon_quick, ('queued'))]}, ], units.Protoss.Disruptor: PROTOSS_BASIC_ACTION_3 + [ {'name': 'Ability_PurificationNova_Attack_Unit', 'arg': ['tag'], 'func': [(219, F.Effect_PurificationNova_screen, ('queued', 'screen_tag'))]}, ], # units.Protoss.DisruptorPhased: PROTOSS_BASIC_ACTION_2, }, }, 'CombatGroup6': { 'describe': "Protoss tactical support commander, controls Sentry and Mothership. 
" "Responsible for providing tactical support by using skills", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Sentry-1', 'unit_type': [units.Protoss.Sentry], 'game_group': 9, 'select_type': 'group'}, {'name': 'Mothership', 'unit_type': [units.Protoss.Mothership], 'game_group': -1, 'select_type': 'select'} ], 'action': { units.Protoss.Sentry: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_ForceField_Screen', 'arg': ['screen'], 'func': [(193, F.Effect_ForceField_screen, ('queued', 'screen'))]}, {'name': 'Ability_GuardianShield', 'arg': [], 'func': [(197, F.Effect_GuardianShield_quick, ('queued'))]}, # # Hallucination not supported in pysc2 # {'name': 'Hallucination_Adept', 'arg': [], # 'func': [(248, F.Hallucination_Adept_quick, ('queued'))]}, # {'name': 'Hallucination_Archon', 'arg': [], # 'func': [(249, F.Hallucination_Archon_quick, ('queued'))]}, # {'name': 'Hallucination_Colossus', 'arg': [], # 'func': [(250, F.Hallucination_Colossus_quick, ('queued'))]}, # {'name': 'Hallucination_Disruptor', 'arg': [], # 'func': [(251, F.Hallucination_Disruptor_quick, ('queued'))]}, # {'name': 'Hallucination_HighTemplar', 'arg': [], # 'func': [(252, F.Hallucination_HighTemplar_quick, ('queued'))]}, # {'name': 'Hallucination_Immortal', 'arg': [], # 'func': [(253, F.Hallucination_Immortal_quick, ('queued'))]}, # {'name': 'Hallucination_Oracle', 'arg': [], # 'func': [(254, F.Hallucination_Oracle_quick, ('queued'))]}, # {'name': 'Hallucination_Phoenix', 'arg': [], # 'func': [(255, F.Hallucination_Phoenix_quick, ('queued'))]}, # {'name': 'Hallucination_Probe', 'arg': [], # 'func': [(256, F.Hallucination_Probe_quick, ('queued'))]}, # {'name': 'Hallucination_Stalker', 'arg': [], # 'func': [(257, F.Hallucination_Stalker_quick, ('queued'))]}, # 
{'name': 'Hallucination_VoidRay', 'arg': [], # 'func': [(258, F.Hallucination_VoidRay_quick, ('queued'))]}, # {'name': 'Hallucination_WarpPrism', 'arg': [], # 'func': [(259, F.Hallucination_WarpPrism_quick, ('queued'))]}, # {'name': 'Hallucination_Zealot', 'arg': [], # 'func': [(260, F.Hallucination_Zealot_quick, ('queued'))]}, ], units.Protoss.Mothership: PROTOSS_BASIC_ACTION_3 + [ # Ability_CloakingField not supported in pysc2 # Ability_MothershipMassRecall not neccessary in simple combat tasks # {'name': 'Ability_MothershipMassRecall_Near', 'arg': ['tag'], # 'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (208, F.Effect_MassRecall_screen, ('queued', 'screen_tag'))]}, {'name': 'Ability_TimeWarp_Attack', 'arg': ['tag'], 'func': [(241, F.Effect_TimeWarp_screen, ('queued', 'screen_tag'))]}, {'name': 'Ability_TimeWarp_Screen', 'arg': ['screen'], 'func': [(241, F.Effect_TimeWarp_screen, ('queued', 'screen'))]}, ], }, }, 'CombatGroup7': { 'describe': "Protoss special force commander, controls Adept and DarkTemplar. 
" "Responsible for infiltrating the enemy's rear and disrupt economic production, sometimes " "collecting reconnaissance infomation, participating in frontline combat.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Adept-1', 'unit_type': [units.Protoss.Adept], 'game_group': -1, 'select_type': 'select_all_type'}, {'name': 'AdeptPhase-1', 'unit_type': [units.Protoss.AdeptPhaseShift], 'game_group': -1, 'select_type': 'select_all_type'}, {'name': 'DarkTemplar-1', 'unit_type': [units.Protoss.DarkTemplar], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'DarkTemplar-2', 'unit_type': [units.Protoss.DarkTemplar], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than one select_all_type not currently supported ], 'action': { units.Protoss.AdeptPhaseShift: PROTOSS_BASIC_ACTION_3, units.Protoss.Adept: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_AdeptPhaseShift_Screen', 'arg': ['screen'], 'func': [(177, F.Effect_AdeptPhaseShift_screen, ('queued', 'screen'))]}, {'name': 'Ability_AdeptPhaseShift_Minimap', 'arg': ['minimap'], 'func': [(547, F.Effect_AdeptPhaseShift_minimap, ('queued', 'minimap'))]}, {'name': 'Ability_CancelPhaseShift', 'arg': [], 'func': [(141, F.Cancel_AdeptPhaseShift_quick, ('queued'))]}, ], units.Protoss.DarkTemplar: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_ShadowStride_Unit', 'arg': ['tag'], 'func': [(182, F.Effect_ShadowStride_screen, ('queued', 'screen_tag'))]}, {'name': 'Morph_Archon', 'arg': [], 'func': [(296, F.Morph_Archon_quick, ('queued'))]}, {'name': 'Select_Two_Unit_Morph_Archon', 'arg': ['tag', 'tag'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (3, F.select_rect, ('add', 'screen1_tag2', 'screen2_tag2')), # screen1/2_tag2 not realized yet (296, 
F.Morph_Archon_quick, ('queued'))]}, ], }, }, 'CombatGroup8': { 'describe': "Protoss air special force commander, controls Oracle and Phoenix. " "Responsible for infiltrating the enemy's rear and disrupt economic production, sometimes " "collecting reconnaissance infomation, participating in frontline combat, or build StasisTrap " "to block the enemy's main force.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Oracle-1', 'unit_type': [units.Protoss.Oracle], 'game_group': -1, 'select_type': 'select_all_type'}, {'name': 'Phoenix-1', 'unit_type': [units.Protoss.Phoenix], 'game_group': -1, 'select_type': 'select_all_type'}, ], 'action': { units.Protoss.Oracle: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_PulsarBeamOn', 'arg': [], 'func': [(38, F.Behavior_PulsarBeamOn_quick, ('queued'))]}, {'name': 'Ability_OracleRevelation_Screen', 'arg': ['screen'], 'func': [(214, F.Effect_OracleRevelation_screen, ('queued', 'screen'))]}, {'name': 'Build_StasisTrap_Screen', 'arg': ['screen'], 'func': [(90, F.Build_StasisTrap_screen, ('queued', 'screen'))]}, {'name': 'Select_Unit_Ability_PulsarBeamOn', 'arg': ['tag'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (38, F.Behavior_PulsarBeamOn_quick, ('queued'))]}, {'name': 'Select_Unit_OracleRevelation_Screen', 'arg': ['tag', 'screen'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (214, F.Effect_OracleRevelation_screen, ('queued', 'screen'))]}, {'name': 'Select_Unit_Build_StasisTrap_Screen', 'arg': ['tag', 'screen'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (90, F.Build_StasisTrap_screen, ('queued', 'screen'))]}, ], units.Protoss.Phoenix: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_GravitonBeam_Unit', 'arg': 
['tag'], 'func': [(196, F.Effect_GravitonBeam_screen, ('queued', 'screen_tag'))]}, {'name': 'Select_Unit_Ability_GravitonBeam_Unit', 'arg': ['tag', 'tag'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (196, F.Effect_GravitonBeam_screen, ('queued', 'screen_tag2'))]}, ], }, }, 'CombatGroup9': { 'describe': "Protoss airborne commander, controls WarpPrism and airborne units like Zealots, Stalkers." "Responsible for supplement troops on the front line, or executing multi line combat. " "Keep stability as much as possible in WarpRismPhashing mode to provide stable power field for " "unit warpping.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'WarpPrism', 'unit_type': [units.Protoss.WarpPrism, units.Protoss.WarpPrismPhasing], 'game_group': -1, 'select_type': 'select'}, ], 'action': { units.Protoss.WarpPrism: PROTOSS_BASIC_ACTION_3 + [ {'name': 'Morph_WarpPrismPhasingMode', 'arg': [], 'func': [(329, F.Morph_WarpPrismPhasingMode_quick, ('queued'))]}, {'name': 'Load_Unit', 'arg': ['tag'], 'func': [(287, F.Load_screen, ('queued', 'screen_tag'))]}, {'name': 'Unload_Screen', 'arg': ['screen'], 'func': [(516, F.UnloadAllAt_screen, ('queued', 'screen'))]}, ], units.Protoss.WarpPrismPhasing: [ {'name': 'Wait_For_Unit_Warp', 'arg': [], 'func': [(0, F.no_op, ())]}, {'name': 'Morph_WarpPrismTransportMode', 'arg': [], 'func': [(330, F.Morph_WarpPrismTransportMode_quick, ('queued'))]}, ], }, }, }
class TerranAgentConfig(AgentConfig):

    def __init__(self):
        super(TerranAgentConfig, self).__init__()


class ZergAgentConfig(AgentConfig):

    def __init__(self):
        super(ZergAgentConfig, self).__init__()
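Below is an overall summary and introduction of the project for your reference. If you need a more detailed or more specific explanation, feel free to ask.
LLM-PySC2 is a Python-based multi-agent decision environment for StarCraft II that connects large language models (LLMs) to the StarCraft II environment through DeepMind's PySC2. It is jointly developed by the NKU Robot Autonomy and Human-AI Collaboration Group (Nankai University) and the NUDT Laboratory for Big Data and Decision (National University of Defense Technology). Its core goal is to give researchers an experimental platform that combines the textual/multimodal decision-making of LLMs with the StarCraft II RTS game.
The project consists of two main parts: the LLM-PySC2 environment and the LLM-SMAC tasks.
This lets researchers feed LLMs text or image observations (such as screen feature images and map element information) and have them act in the StarCraft II environment through textual commands, enabling strategy generation, multi-agent cooperation, and combat across StarCraft II scenarios.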
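Mainstream LLM support
Supports at least five families of large models, including GPT-3.5 and GPT-4v (multimodal); you can switch models or plug in a custom LLM client as needed.
Macro decisions and micro-operations
The LLM can make strategic, macro-level decisions and can also issue micro-operation commands directly to individual units (such as "Blink" or casting Psi Storm).
Multi-agent cooperation
Supports point-to-point and domain communication between multiple (even heterogeneous) agents, and each sub-agent for a given unit type or role can be given its own LLM prompt and decision logic.
Rich environments and tasks
Multi-threaded parallelism
Several game threads can run at once, which is convenient for research that needs large-scale experiments.
Diverse observations
Complete action space
Supports almost all StarCraft II unit abilities and operations, including buildings, technology, abilities, and items.
Automatic economy management
Provides automated worker training, worker assignment to mining, and basic construction, so that research can focus on tactics and strategy.
High-quality logging and data recording
The whole game process, including the LLM's decision outputs and the game frames, can be recorded in full for reproduction and analysis.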
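Get StarCraft II
By default, LLM-PySC2 expects the game to live in ~/StarCraftII/. You can override this path by setting the SC2PATH environment variable.
Get LLM-PySC2
conda create --name YOUR_ENV_NAME python==3.9
conda activate YOUR_ENV_NAME
pip install -e .
Map installation
Copy the two folders llm_pysc2/maps/llm_pysc2 and llm_pysc2/maps/llm_smac into the Maps folder of the StarCraft II installation directory. For example:
C:\Program Files (x86)\StarCraft II\Maps\llm_pysc2
C:\Program Files (x86)\StarCraft II\Maps\llm_smac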
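LLM API Key
Set model_name, api_base, and api_key either in ./llm_pysc2/agents/configs/config.ProtossAgentConfig or through config.reset_llm() in the run script (e.g., experiment_llm_pysc2.py). Alternatively, set config.LLM_SIMULATION_TIME to a value greater than 0 (to simulate the LLM's thinking latency) so that only the environment pipeline is tested.
Single-scenario test
python -m llm_pysc2.bin.experiment_llm_pysc2
python -m llm_pysc2.bin.experiment_llm_smac
These test the llm_pysc2 and llm_smac tasks respectively; by default, gpt-3.5-turbo is loaded for text-based decision-making.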
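The LLM configuration step above can be sketched in a few lines. This is a minimal illustration, not the project's code: `SketchConfig` and `uses_real_llm` are hypothetical stand-ins, and only the placeholder strings and the `LLM_SIMULATION_TIME` convention are taken from the config.py pasted in this conversation.

```python
class SketchConfig:
    """Hypothetical stand-in for the project's AgentConfig (not the real class)."""

    # Placeholder strings that the pasted reset_llm() refuses to apply.
    PLACEHOLDERS = ('YOUR-MODEL-NAME', 'YOUR-API-BASE', 'YOUR-API-KEY')

    def __init__(self):
        self.model_name = 'YOUR-MODEL-NAME'
        self.api_base = 'YOUR-API-BASE'
        self.api_key = 'YOUR-API-KEY'
        self.LLM_SIMULATION_TIME = 0  # > 0 means: simulate the LLM, do not call it

    def reset_llm(self, model_name=None, api_base=None, api_key=None):
        # Only overwrite a field when a real (non-placeholder) value is given.
        for attr, value in (('model_name', model_name),
                            ('api_base', api_base),
                            ('api_key', api_key)):
            if value is not None and value not in self.PLACEHOLDERS:
                setattr(self, attr, value)

    def uses_real_llm(self):
        # Hypothetical helper: a real query needs a key and simulation turned off.
        has_key = self.api_key not in ('', 'YOUR-API-KEY')
        return has_key and self.LLM_SIMULATION_TIME <= 0


config = SketchConfig()
config.reset_llm(model_name='gpt-3.5-turbo', api_key='YOUR-API-KEY')
print(config.model_name, config.uses_real_llm())
```

With the placeholder key left in place, the sketch reports that no real LLM would be queried, which corresponds to the environment-only test mode described above.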
Parallel runs
python -m pysc2.bin.agent --map pvz_task4_level1 --agent_race protoss --parallel 2 --agent llm_pysc2.bin.experiment_llm_pysc2.MainAgentLLMPysc2
Or use --parallel 4 to run more games at the same time and speed up experiments.
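LLM-SMAC tasks include 2s3z, 3s5z, 3s_vs_5z, 1c3s5z, 2c_vs_64zg, and others; most are small-scale battles between mixed Zerg, Protoss, or Terran units.
LLM-PySC2 tasks are task1 ~ task8, each with three difficulty levels (level-1, level-2, level-3).
For example, task4 is 12 Stalkers vs many zerg units; the higher the level, the stronger the Zerg or the more map elements there are.
Custom LLM interaction flow
Override the LLMAgent.query() method in llm_pysc2/agents/llm_pysc2_agent.py to implement custom prompt construction, message format handling, and so on.
Custom main/sub agents
Configure them under the llm_pysc2/agents/configs folder (e.g., ProtossAgentConfig).
Custom observation and action translators
In llm_pysc2/lib/llm_observations.py you can customize o_translator (observation translation, which decides how the text given to the LLM is organized).
In llm_pysc2/lib/llm_actions.py you can customize a_translator (action translation, which parses the LLM's output text and maps it to pysc2 action IDs and parameters).
Custom LLM client
Extend it in llm_pysc2/lib/llm_client.
Before 2025/2/1:
Before 2025/5/1:
Before 2025/7/1: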
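Nankai University:
National University of Defense Technology:
In addition, thanks to the SMAC team and DeepMind for their open-source projects, and to the reference project Large-Language-Models-play-StarCraftII.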
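LLM-PySC2 aims to let large models make multi-agent decisions in StarCraft II. It provides a variety of complex task scenarios (such as PvZ combat and SMAC squad battles) together with an extensible framework for multimodal observation and text interaction. For researchers who want to explore:
it offers significant research value. The project is still iterating rapidly; planned additions include more full-game maps, richer unit abilities and economy management, support for Zerg and Terran, and interfaces for coexisting with or comparing against classic RL algorithms.
If you have any questions, discuss them on GitHub Issues or contact the developers directly at the e-mail addresses given in the documentation.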
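To make the action-translator idea mentioned above more concrete, here is a minimal sketch of how LLM output text could be parsed and mapped to pysc2 function IDs. The `<ActionName(args)>` text format, `ACTION_PATTERN`, and `translate_actions` are assumptions made for illustration; they are not the project's actual a_translator. The two action names and function IDs (180, 197) are copied from the pasted config.

```python
import ast
import re

# Stand-in action table; names and function IDs copied from the pasted config.
ACTIONS = [
    {'name': 'Ability_Blink_Screen', 'arg': ['screen'], 'func_id': 180},
    {'name': 'Ability_GuardianShield', 'arg': [], 'func_id': 197},
]

# Hypothetical output format: <ActionName(arg1, arg2, ...)>
ACTION_PATTERN = re.compile(r'<(\w+)\((.*?)\)>')


def translate_actions(llm_text, actions=ACTIONS):
    """Return (func_id, args) pairs for every valid action found in llm_text."""
    by_name = {a['name']: a for a in actions}
    parsed = []
    for name, raw_args in ACTION_PATTERN.findall(llm_text):
        spec = by_name.get(name)
        if spec is None:
            continue  # unknown action name: ignore it
        try:
            # Wrap in a tuple so "[10, 20]" becomes one argument, "" becomes none.
            args = list(ast.literal_eval(f'({raw_args},)')) if raw_args.strip() else []
        except (ValueError, SyntaxError):
            continue  # malformed arguments: ignore this action
        if len(args) != len(spec['arg']):
            continue  # wrong number of arguments for this action
        parsed.append((spec['func_id'], args))
    return parsed


reply = "Retreat first. <Ability_Blink_Screen([10, 20])> <Ability_GuardianShield()> <Fly_To_Moon()>"
print(translate_actions(reply))
```

Unknown or malformed actions are skipped here rather than raising an error; since LLM output is not guaranteed to follow the format, an actual translator would have to choose some policy for such cases.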
I'm a beginner; please help me understand the contents of config.py.
from llm_pysc2.lib.llm_action import PROTOSS_ACTION_BUILD, \
    PROTOSS_BASIC_ACTION_1, PROTOSS_BASIC_ACTION_2, PROTOSS_BASIC_ACTION_3, \
    PROTOSS_ACTION_WARPTRAIN, PROTOSS_ACTION_TRAIN, PROTOSS_ACTION_RESEARCH, F
from llm_pysc2.lib.llm_client import vision_model_names  # , video_model_names

from pysc2.lib import units
from loguru import logger
import time


def wait(second, log_id, more_info=''):
    for i in range(5):
        logger.warning(f"[ID {log_id}] Experiment will start with UNSAFE settings in {5 - i} seconds. {more_info}")
        time.sleep(1)


class AgentConfig:

    def __init__(self):
        self.race = 'protoss'
        self.model_name = 'gpt-3.5-turbo'  # 'gpt-3.5-turbo'
        self.api_base = 'https://hk.xty.app/v1'  # 'https://hk.xty.app/v1'
        self.api_key = 'sk-6QeWXN15vgwIC1ptCf6e9eFb323f4dF5Bd8c6a0b4746654f'  # 'xxxxxxxxxxxxxxxxxxxxxxxx....'
        self.temperature = 0.1

        self.basic_prompt = 'default'
        self.translator_o = 'default'
        self.translator_a = 'default'
        self.communicator = 'default'

        self.ENABLE_INIT_STEPS = True
        self.ENABLE_AUTO_WORKER_MANAGE = True
        self.ENABLE_AUTO_WORKER_TRAINING = True
        self.ENABLE_COMMUNICATION = False

        self.ENABLE_IMAGE_RGB = False
        self.ENABLE_IMAGE_FEATURE = False
        self.ENABLE_SAVE_IMAGES = True

        self.LLM_SIMULATION_TIME = 0
        self.MAX_LLM_QUERY_TIMES = 5
        self.MAX_LLM_WAITING_TIME = 15
        self.MAX_LLM_RUNTIME_ERROR_TIME = 45
        self.MAX_LLM_DECISION_FREQUENCY = 1
        self.MAX_NUM_ACTIONS = 3

        self.AGENTS = []
        self.AGENTS_ALWAYS_DISABLE = []

    def reset_llm(self, model_name=None, api_base=None, api_key=None, ENABLE_IMAGE_RGB=None, ENABLE_IMAGE_FEATURE=None):
        # Placeholder values such as 'YOUR-API-KEY' are ignored.
        if model_name is not None and model_name != 'YOUR-MODEL-NAME':
            self.model_name = model_name
        if api_base is not None and api_base != 'YOUR-API-BASE':
            self.api_base = api_base
        if api_key is not None and api_key != 'YOUR-API-KEY':
            self.api_key = api_key
        if ENABLE_IMAGE_RGB is not None:
            self.ENABLE_IMAGE_RGB = ENABLE_IMAGE_RGB
        if ENABLE_IMAGE_FEATURE is not None:
            self.ENABLE_IMAGE_FEATURE = ENABLE_IMAGE_FEATURE
        if ENABLE_IMAGE_RGB is True and ENABLE_IMAGE_FEATURE is True:
            raise AssertionError("Do not support ENABLE_IMAGE_RGB and ENABLE_IMAGE_FEATURE at the same time, currently")

        # Propagate the LLM settings to every configured agent.
        for agent_name in self.AGENTS.keys():
            self.AGENTS[agent_name]['llm']['model_name'] = self.model_name
            self.AGENTS[agent_name]['llm']['api_base'] = self.api_base
            self.AGENTS[agent_name]['llm']['api_key'] = self.api_key
            if self.ENABLE_IMAGE_RGB:
                self.AGENTS[agent_name]['llm']['img_rgb'] = True
                self.AGENTS[agent_name]['llm']['img_fea'] = False
            elif self.ENABLE_IMAGE_FEATURE:
                self.AGENTS[agent_name]['llm']['img_rgb'] = False
                self.AGENTS[agent_name]['llm']['img_fea'] = True
            else:
                self.AGENTS[agent_name]['llm']['img_rgb'] = False
                self.AGENTS[agent_name]['llm']['img_fea'] = False

    def auto_check(self, log_id):
        # Fall back to LLM simulation mode when the LLM settings are incomplete.
        if not isinstance(self.LLM_SIMULATION_TIME, (int, float)) or self.LLM_SIMULATION_TIME <= 0:
            error_in_llm_setting = False
            if self.model_name == '' or self.model_name == 'YOUR-MODEL-NAME':
                self.reset_llm(model_name='gpt-3.5-turbo')
                logger.error(f"[ID {log_id}] No model_name set, please specify model_name in the config.")
                self.LLM_SIMULATION_TIME = 5
                error_in_llm_setting = True
            if self.api_key == '' or self.api_key == 'YOUR-API-KEY':
                logger.error(f"[ID {log_id}] No api_key set, please specify your api_key in the config.")
                self.LLM_SIMULATION_TIME = 5
                error_in_llm_setting = True
            if self.model_name == '' or self.api_key == '':
                self.LLM_SIMULATION_TIME = 5
                error_in_llm_setting = True
            if error_in_llm_setting:
                wait(5, log_id, "(in LLM SIMULATION MODE)")

        # Check that the image settings match the capabilities of the chosen model.
        if self.ENABLE_IMAGE_RGB or self.ENABLE_IMAGE_FEATURE:
            if self.ENABLE_IMAGE_RGB and self.ENABLE_IMAGE_FEATURE:
                logger.error(f"[ID {log_id}] can not enable config.ENABLE_IMAGE_RGB and config.ENABLE_IMAGE_FEATURE together.")
                raise AssertionError("config.ENABLE_IMAGE_RGB and config.ENABLE_IMAGE_FEATURE can not be True together")
            if self.model_name not in vision_model_names:
                logger.error(f"[ID {log_id}] config.ENABLE_IMAGE_RGB/FEATURE with large models that do not support images.")
                wait(5, log_id)
            if self.model_name in vision_model_names:
                logger.warning(f"[ID {log_id}] You are using a vision model with image obs, this may cost a lot, be cautious.")
                wait(5, log_id)
        else:
            if self.model_name in vision_model_names:
                logger.warning(f"[ID {log_id}] You are using a vision available model without using any image obs.")
                wait(5, log_id)
class ProtossAgentConfig(AgentConfig):

    def __init__(self):
        super(ProtossAgentConfig, self).__init__()
text# Program control parameters in class AgentConfig (above) self.AGENTS_ALWAYS_DISABLE = [] self.AGENTS = { 'Airborne': { 'describe': "Protoss airborne commander, controls units airborne/warptrain from WarpPrism. " "Responsible for quick reinforcing nearby units or executing multiline combat.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Airborne-Zealot-1', 'unit_type': [units.Protoss.Zealot], 'game_group': -1, 'select_type': 'select_all_type'}, # , 'max_unit_num': {units.Protoss.Zealot: -1} ], 'action': { units.Protoss.Zealot: PROTOSS_BASIC_ACTION_2, }, }, 'Builder': { 'describe': "Protoss builder, controls several Probe. Responsible for build buildings", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Builder-Probe-1', 'unit_type': [units.Protoss.Probe], 'game_group': -1, 'select_type': 'select'}, # , 'max_unit_num': {units.Protoss.Probe: -1} ], 'action': { units.Protoss.Probe: PROTOSS_BASIC_ACTION_2 + PROTOSS_ACTION_BUILD, }, }, 'Commander': { 'describe': "Protoss military supreme commander. " "Responsible for making macro decision through communication, and controls nexus for massrecall " "for tactical objectives. 
When make deployment, describe the time, location, and objectives of the " "mission as clearly as possible", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': 'commander', 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Empty', 'unit_type': [], 'game_group': -1, 'select_type': 'select'}, ], 'action': { 'EmptyGroup': [], }, }, 'Developer': { 'describe': "Protoss logistics commander. " "Responsible for unit trainning, unit warp trainning, technology upgrade and order the Builder " "to build.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': 'developer', 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'WarpGate-1', 'unit_type': [units.Protoss.WarpGate], 'game_group': -1, 'select_type': 'select_all_type'}, # , 'max_unit_num': {units.Protoss.WarpGate: -1} {'name': 'Empty', 'unit_type': [], 'game_group': -1, 'select_type': 'select'}, ], 'action': { units.Protoss.WarpGate: PROTOSS_ACTION_WARPTRAIN, 'EmptyGroup': PROTOSS_BASIC_ACTION_1 + PROTOSS_ACTION_RESEARCH + PROTOSS_ACTION_TRAIN + [ {'name': 'Stop_Building_Unit', 'arg': ['tag'], 'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (454, F.Stop_Building_quick, ('queued'))]} ], }, }, 'Defender': { 'describe': "Protoss garrison troops commander, controls several Stalkers. 
" "Responsible for intercepting enemy infiltrating forces.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ # {'name': 'Nexus', 'unit_type': [units.Protoss.Nexus], # 'game_group': -1, 'select_type': 'select', 'max_unit_num': {units.Protoss.Nexus: 8}}, {'name': 'Stalker-1', 'unit_type': [units.Protoss.Stalker], 'game_group': 1, 'select_type': 'group'}, # , 'max_unit_num': {units.Protoss.Stalker: 8} ], 'action': { # units.Protoss.Nexus: [ # {'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]}, # {'name': 'Select_Workers_Attack_Screen', 'arg': ['screen'], # 'func': [(12, F.Attack_screen, ('queued', 'screen'))]}, # {'name': 'Select_Workers_Move_Screen', 'arg': ['screen'], # 'func': [(331, F.Move_screen, ('queued', 'screen'))]}, # ], units.Protoss.Stalker: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_Blink_Screen', 'arg': ['screen'], 'func': [(180, F.Effect_Blink_screen, ('queued', 'screen'))]}, {'name': 'Select_Unit_Blink_Screen', 'arg': ['tag', 'screen'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (180, F.Effect_Blink_screen, ('now', 'screen'))]}, ] }, }, 'CombatGroup0': { 'describe': "Protoss frontline commander, controls several Zealots. 
" "Responsible for providing cover for the main force and executing multi line combat.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Zealot-1', 'unit_type': [units.Protoss.Zealot], 'game_group': 2, 'select_type': 'group'}, # , 'max_unit_num': {units.Protoss.Zealot: -1}, -1 for unlimited {'name': 'Zealot-2', 'unit_type': [units.Protoss.Zealot], 'game_group': 3, 'select_type': 'group'}, # , 'max_unit_num': {units.Protoss.Zealot: -1}, -1 for unlimited ], 'action': { units.Protoss.Zealot: PROTOSS_BASIC_ACTION_2, }, }, 'CombatGroup1': { 'describe': "Protoss frontline commander, controls several Stalkers. " "Responsible for providing cover for the main force and restraining enemy forces.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Stalker-1', 'unit_type': [units.Protoss.Stalker], 'game_group': 4, 'select_type': 'group'}, {'name': 'Stalker-2', 'unit_type': [units.Protoss.Stalker], 'game_group': 5, 'select_type': 'group'}, {'name': 'Stalker-3', 'unit_type': [units.Protoss.Stalker], 'game_group': 6, 'select_type': 'group'}, ], 'action': { units.Protoss.Stalker: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_Blink_Screen', 'arg': ['screen'], 'func': [(180, F.Effect_Blink_screen, ('queued', 'screen'))]}, {'name': 'Select_Unit_Blink_Screen', 'arg': ['tag', 'screen'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (180, F.Effect_Blink_screen, ('now', 'screen'))]}, ] }, }, 'CombatGroup2': { 'describe': "Protoss frontline commander, controls ground main force such as 
Immortal, Colossus and Archon. " "Responsible for frontal combat.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Immortal-1', 'unit_type': [units.Protoss.Immortal], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'Immortal-2', 'unit_type': [units.Protoss.Immortal], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported {'name': 'Colossus-1', 'unit_type': [units.Protoss.Colossus], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'Colossus-2', 'unit_type': [units.Protoss.Colossus], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported {'name': 'Archon-1', 'unit_type': [units.Protoss.Archon], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'Archon-2', 'unit_type': [units.Protoss.Archon], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported ], 'action': { units.Protoss.Immortal: PROTOSS_BASIC_ACTION_2, units.Protoss.Colossus: PROTOSS_BASIC_ACTION_2, units.Protoss.Archon: PROTOSS_BASIC_ACTION_2, }, }, 'CombatGroup3': { 'describe': "Protoss frontline commander, controls air main force such as VoidRay, Carrier and Tempest. 
" "Responsible for frontal combat.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'VoidRay-1', 'unit_type': [units.Protoss.VoidRay], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'VoidRay-2', 'unit_type': [units.Protoss.VoidRay], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported {'name': 'Carrier-1', 'unit_type': [units.Protoss.Carrier], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'Carrier-2', 'unit_type': [units.Protoss.Carrier], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported {'name': 'Tempest-1', 'unit_type': [units.Protoss.Tempest], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'Tempest-2', 'unit_type': [units.Protoss.Tempest], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than 1 select_all_type not currently supported ], 'action': { units.Protoss.Carrier: PROTOSS_BASIC_ACTION_2, units.Protoss.Tempest: PROTOSS_BASIC_ACTION_2, units.Protoss.VoidRay: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_PrismaticAlignment', 'arg': [], 'func': [(244, F.Effect_VoidRayPrismaticAlignment_quick, ('queued'))]}, ], }, }, 'CombatGroup4': { 'describe': "Protoss reconnaissance commander, controls Observer and several Probe. 
" "Responsible for providing reconnaissance infomation and detect cloak unit for main force", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Probe', 'unit_type': [units.Protoss.Probe], 'game_group': -1, 'select_type': 'select'}, {'name': 'Observer', 'unit_type': [units.Protoss.Observer, units.Protoss.ObserverSurveillanceMode], 'game_group': -1, 'select_type': 'select'}, ], 'action': { units.Protoss.Probe: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Lock_Nexus_Near', 'arg': ['tag'], 'func': [(70, F.Build_Pylon_screen, ('queued', 'screen_tag'))]}, {'name': 'Lock_Assimilator_Near', 'arg': ['tag'], 'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen_tag'))]}, ], units.Protoss.Observer: PROTOSS_BASIC_ACTION_3 + [ {'name': 'Morph_SurveillanceMode', 'arg': [], 'func': [(538, F.Morph_SurveillanceMode_quick, ('queued'))]}, ], units.Protoss.ObserverSurveillanceMode: [ {'name': 'Continuously_Monitor_Here', 'arg': [], 'func': [(0, F.no_op, ())]}, {'name': 'Morph_ObserverMode', 'arg': [], 'func': [(535, F.Morph_ObserverMode_quick, ('queued'))]}, ], }, }, 'CombatGroup5': { 'describe': "Protoss AOE commander, controls HighTemplar and Disruptor. 
" "Responsible for dealing high damage to clustered enemies", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'HighTemplar-1', 'unit_type': [units.Protoss.HighTemplar], 'game_group': 7, 'select_type': 'group'}, {'name': 'Disruptor-1', 'unit_type': [units.Protoss.Disruptor], 'game_group': 8, 'select_type': 'group'}, ], 'action': { units.Protoss.HighTemplar: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_PsiStorm_Screen', 'arg': ['screen'], 'func': [(218, F.Effect_PsiStorm_screen, ('queued', 'screen'))]}, {'name': 'Ability_PsiStorm_Attack_Unit', 'arg': ['tag'], 'func': [(218, F.Effect_PsiStorm_screen, ('queued', 'screen_tag'))]}, {'name': 'Morph_Archon', 'arg': [], 'func': [(296, F.Morph_Archon_quick, ('queued'))]}, {'name': 'Select_Two_Unit_Morph_Archon', 'arg': ['tag', 'tag'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (3, F.select_rect, ('add', 'screen1_tag2', 'screen2_tag2')), (296, F.Morph_Archon_quick, ('queued'))]}, ], units.Protoss.Disruptor: PROTOSS_BASIC_ACTION_3 + [ {'name': 'Ability_PurificationNova_Attack_Unit', 'arg': ['tag'], 'func': [(219, F.Effect_PurificationNova_screen, ('queued', 'screen_tag'))]}, ], # units.Protoss.DisruptorPhased: PROTOSS_BASIC_ACTION_2, }, }, 'CombatGroup6': { 'describe': "Protoss tactical support commander, controls Sentry and Mothership. 
" "Responsible for providing tactical support by using skills", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Sentry-1', 'unit_type': [units.Protoss.Sentry], 'game_group': 9, 'select_type': 'group'}, {'name': 'Mothership', 'unit_type': [units.Protoss.Mothership], 'game_group': -1, 'select_type': 'select'} ], 'action': { units.Protoss.Sentry: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_ForceField_Screen', 'arg': ['screen'], 'func': [(193, F.Effect_ForceField_screen, ('queued', 'screen'))]}, {'name': 'Ability_GuardianShield', 'arg': [], 'func': [(197, F.Effect_GuardianShield_quick, ('queued'))]}, # # Hallucination not supported in pysc2 # {'name': 'Hallucination_Adept', 'arg': [], # 'func': [(248, F.Hallucination_Adept_quick, ('queued'))]}, # {'name': 'Hallucination_Archon', 'arg': [], # 'func': [(249, F.Hallucination_Archon_quick, ('queued'))]}, # {'name': 'Hallucination_Colossus', 'arg': [], # 'func': [(250, F.Hallucination_Colossus_quick, ('queued'))]}, # {'name': 'Hallucination_Disruptor', 'arg': [], # 'func': [(251, F.Hallucination_Disruptor_quick, ('queued'))]}, # {'name': 'Hallucination_HighTemplar', 'arg': [], # 'func': [(252, F.Hallucination_HighTemplar_quick, ('queued'))]}, # {'name': 'Hallucination_Immortal', 'arg': [], # 'func': [(253, F.Hallucination_Immortal_quick, ('queued'))]}, # {'name': 'Hallucination_Oracle', 'arg': [], # 'func': [(254, F.Hallucination_Oracle_quick, ('queued'))]}, # {'name': 'Hallucination_Phoenix', 'arg': [], # 'func': [(255, F.Hallucination_Phoenix_quick, ('queued'))]}, # {'name': 'Hallucination_Probe', 'arg': [], # 'func': [(256, F.Hallucination_Probe_quick, ('queued'))]}, # {'name': 'Hallucination_Stalker', 'arg': [], # 'func': [(257, F.Hallucination_Stalker_quick, ('queued'))]}, # 
{'name': 'Hallucination_VoidRay', 'arg': [], # 'func': [(258, F.Hallucination_VoidRay_quick, ('queued'))]}, # {'name': 'Hallucination_WarpPrism', 'arg': [], # 'func': [(259, F.Hallucination_WarpPrism_quick, ('queued'))]}, # {'name': 'Hallucination_Zealot', 'arg': [], # 'func': [(260, F.Hallucination_Zealot_quick, ('queued'))]}, ], units.Protoss.Mothership: PROTOSS_BASIC_ACTION_3 + [ # Ability_CloakingField not supported in pysc2 # Ability_MothershipMassRecall not neccessary in simple combat tasks # {'name': 'Ability_MothershipMassRecall_Near', 'arg': ['tag'], # 'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (208, F.Effect_MassRecall_screen, ('queued', 'screen_tag'))]}, {'name': 'Ability_TimeWarp_Attack', 'arg': ['tag'], 'func': [(241, F.Effect_TimeWarp_screen, ('queued', 'screen_tag'))]}, {'name': 'Ability_TimeWarp_Screen', 'arg': ['screen'], 'func': [(241, F.Effect_TimeWarp_screen, ('queued', 'screen'))]}, ], }, }, 'CombatGroup7': { 'describe': "Protoss special force commander, controls Adept and DarkTemplar. 
" "Responsible for infiltrating the enemy's rear and disrupt economic production, sometimes " "collecting reconnaissance infomation, participating in frontline combat.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Adept-1', 'unit_type': [units.Protoss.Adept], 'game_group': -1, 'select_type': 'select_all_type'}, {'name': 'AdeptPhase-1', 'unit_type': [units.Protoss.AdeptPhaseShift], 'game_group': -1, 'select_type': 'select_all_type'}, {'name': 'DarkTemplar-1', 'unit_type': [units.Protoss.DarkTemplar], 'game_group': -1, 'select_type': 'select_all_type'}, # {'name': 'DarkTemplar-2', 'unit_type': [units.Protoss.DarkTemplar], # 'game_group': -1, 'select_type': 'select_all_type'}, # more than one select_all_type not currently supported ], 'action': { units.Protoss.AdeptPhaseShift: PROTOSS_BASIC_ACTION_3, units.Protoss.Adept: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_AdeptPhaseShift_Screen', 'arg': ['screen'], 'func': [(177, F.Effect_AdeptPhaseShift_screen, ('queued', 'screen'))]}, {'name': 'Ability_AdeptPhaseShift_Minimap', 'arg': ['minimap'], 'func': [(547, F.Effect_AdeptPhaseShift_minimap, ('queued', 'minimap'))]}, {'name': 'Ability_CancelPhaseShift', 'arg': [], 'func': [(141, F.Cancel_AdeptPhaseShift_quick, ('queued'))]}, ], units.Protoss.DarkTemplar: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_ShadowStride_Unit', 'arg': ['tag'], 'func': [(182, F.Effect_ShadowStride_screen, ('queued', 'screen_tag'))]}, {'name': 'Morph_Archon', 'arg': [], 'func': [(296, F.Morph_Archon_quick, ('queued'))]}, {'name': 'Select_Two_Unit_Morph_Archon', 'arg': ['tag', 'tag'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (3, F.select_rect, ('add', 'screen1_tag2', 'screen2_tag2')), # screen1/2_tag2 not realized yet (296, 
F.Morph_Archon_quick, ('queued'))]}, ], }, }, 'CombatGroup8': { 'describe': "Protoss air special force commander, controls Oracle and Phoenix. " "Responsible for infiltrating the enemy's rear and disrupt economic production, sometimes " "collecting reconnaissance infomation, participating in frontline combat, or build StasisTrap " "to block the enemy's main force.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'Oracle-1', 'unit_type': [units.Protoss.Oracle], 'game_group': -1, 'select_type': 'select_all_type'}, {'name': 'Phoenix-1', 'unit_type': [units.Protoss.Phoenix], 'game_group': -1, 'select_type': 'select_all_type'}, ], 'action': { units.Protoss.Oracle: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_PulsarBeamOn', 'arg': [], 'func': [(38, F.Behavior_PulsarBeamOn_quick, ('queued'))]}, {'name': 'Ability_OracleRevelation_Screen', 'arg': ['screen'], 'func': [(214, F.Effect_OracleRevelation_screen, ('queued', 'screen'))]}, {'name': 'Build_StasisTrap_Screen', 'arg': ['screen'], 'func': [(90, F.Build_StasisTrap_screen, ('queued', 'screen'))]}, {'name': 'Select_Unit_Ability_PulsarBeamOn', 'arg': ['tag'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (38, F.Behavior_PulsarBeamOn_quick, ('queued'))]}, {'name': 'Select_Unit_OracleRevelation_Screen', 'arg': ['tag', 'screen'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (214, F.Effect_OracleRevelation_screen, ('queued', 'screen'))]}, {'name': 'Select_Unit_Build_StasisTrap_Screen', 'arg': ['tag', 'screen'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (90, F.Build_StasisTrap_screen, ('queued', 'screen'))]}, ], units.Protoss.Phoenix: PROTOSS_BASIC_ACTION_2 + [ {'name': 'Ability_GravitonBeam_Unit', 'arg': 
['tag'], 'func': [(196, F.Effect_GravitonBeam_screen, ('queued', 'screen_tag'))]}, {'name': 'Select_Unit_Ability_GravitonBeam_Unit', 'arg': ['tag', 'tag'], 'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')), (196, F.Effect_GravitonBeam_screen, ('queued', 'screen_tag2'))]}, ], }, }, 'CombatGroup9': { 'describe': "Protoss airborne commander, controls WarpPrism and airborne units like Zealots, Stalkers." "Responsible for supplement troops on the front line, or executing multi line combat. " "Keep stability as much as possible in WarpRismPhashing mode to provide stable power field for " "unit warpping.", 'llm': { 'basic_prompt': self.basic_prompt, 'translator_o': self.translator_o, 'translator_a': self.translator_a, 'img_fea': self.ENABLE_IMAGE_FEATURE, 'img_rgb': self.ENABLE_IMAGE_RGB, 'model_name': self.model_name, 'api_base': self.api_base, 'api_key': self.api_key, }, 'team': [ {'name': 'WarpPrism', 'unit_type': [units.Protoss.WarpPrism, units.Protoss.WarpPrismPhasing], 'game_group': -1, 'select_type': 'select'}, ], 'action': { units.Protoss.WarpPrism: PROTOSS_BASIC_ACTION_3 + [ {'name': 'Morph_WarpPrismPhasingMode', 'arg': [], 'func': [(329, F.Morph_WarpPrismPhasingMode_quick, ('queued'))]}, {'name': 'Load_Unit', 'arg': ['tag'], 'func': [(287, F.Load_screen, ('queued', 'screen_tag'))]}, {'name': 'Unload_Screen', 'arg': ['screen'], 'func': [(516, F.UnloadAllAt_screen, ('queued', 'screen'))]}, ], units.Protoss.WarpPrismPhasing: [ {'name': 'Wait_For_Unit_Warp', 'arg': [], 'func': [(0, F.no_op, ())]}, {'name': 'Morph_WarpPrismTransportMode', 'arg': [], 'func': [(330, F.Morph_WarpPrismTransportMode_quick, ('queued'))]}, ], }, }, }
class TerranAgentConfig(AgentConfig):
    def __init__(self):
        super(TerranAgentConfig, self).__init__()


class ZergAgentConfig(AgentConfig):
    def __init__(self):
        super(ZergAgentConfig, self).__init__()
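This config.py file configures and manages the parameters of the Protoss agent in the LLM-PySC2 environment: which large language model (LLM) to use, how the multiple sub-agents are set up, the action space available to each sub-agent, and some basic rules for interacting with the environment. To help you sort out what each part does, I will go from the overall structure down to the details.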
The file contains several classes and helper functions:
wait() function: a simple waiting function, mainly used to print a log and, when a risky configuration is detected (such as a missing API key), to give the user 5 seconds to interrupt or confirm.
class AgentConfig: a generic base class for agent configuration. It holds most of the common settings, such as the LLM endpoint, the API key, whether images are enabled, the maximum number of actions, and the timeout limits for the LLM.
class ProtossAgentConfig(AgentConfig): a concrete configuration class for Protoss units. It inherits from AgentConfig and defines a large number of sub-agents on top of it (such as "Builder", "Defender", "CombatGroupX"); each sub-agent has a description, the units it can control, and the list of actions it can execute.
class TerranAgentConfig(AgentConfig) / class ZergAgentConfig(AgentConfig): the class skeletons are still under development; similar configurations for Terran and Zerg are to be provided later.

AgentConfig base class in detail

class AgentConfig:
    def __init__(self):
        self.race = 'protoss'
        self.model_name = 'gpt-3.5-turbo'
        self.api_base = 'https://hk.xty.app/v1'
        self.api_key = 'sk-6QeW...'
        self.temperature = 0.1
        self.basic_prompt = 'default'
        self.translator_o = 'default'
        self.translator_a = 'default'
        self.communicator = 'default'
        self.ENABLE_INIT_STEPS = True
        self.ENABLE_AUTO_WORKER_MANAGE = True
        self.ENABLE_AUTO_WORKER_TRAINING = True
        self.ENABLE_COMMUNICATION = False
        self.ENABLE_IMAGE_RGB = False
        self.ENABLE_IMAGE_FEATURE = False
        self.ENABLE_SAVE_IMAGES = True
        self.LLM_SIMULATION_TIME = 0
        self.MAX_LLM_QUERY_TIMES = 5
        self.MAX_LLM_WAITING_TIME = 15
        self.MAX_LLM_RUNTIME_ERROR_TIME = 45
        self.MAX_LLM_DECISION_FREQUENCY = 1
        self.MAX_NUM_ACTIONS = 3
        self.AGENTS = []
        self.AGENTS_ALWAYS_DISABLE = []
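self.race: the default race, here 'protoss'.
self.model_name: the name of the LLM to use, for example 'gpt-3.5-turbo'.
self.api_base and self.api_key: the endpoint URL and the key used to call the LLM.
self.temperature: the randomness parameter of the generative model, usually between 0 and 2; larger values give more divergent output, smaller values more conservative output.
self.basic_prompt / self.translator_o / self.translator_a / self.communicator: configure the prompt and the translators used when interacting with the LLM.
    translator_o generally refers to the translator that turns observations into text; translator_a generally refers to the translator that parses the LLM's text output into actions; communicator relates to dialogue between agents.
self.ENABLE_INIT_STEPS: whether to run some initialization commands at the start of the game.
self.ENABLE_AUTO_WORKER_MANAGE: whether to manage workers automatically (mining, assignment, etc.).
self.ENABLE_AUTO_WORKER_TRAINING: whether to train workers automatically.
self.ENABLE_COMMUNICATION: whether to enable text communication or information sharing between agents.
self.ENABLE_IMAGE_RGB / self.ENABLE_IMAGE_FEATURE: whether to enable image (RGB) or feature-map (Feature) observations.
self.ENABLE_SAVE_IMAGES: whether to save these images locally for later analysis or visualization.
self.LLM_SIMULATION_TIME: if you have no real LLM API key, or do not want to pay for calls, this simulates an "LLM thinking time" so the pipeline can still be tested.
self.MAX_LLM_QUERY_TIMES: the maximum number of LLM calls per turn, which prevents repeated calls from wasting budget or looping forever.
self.MAX_LLM_WAITING_TIME: the timeout when waiting for the LLM to generate a response.
self.MAX_LLM_RUNTIME_ERROR_TIME: a time limit that applies when the LLM keeps raising errors, after which the run times out or stops.
self.MAX_LLM_DECISION_FREQUENCY: how many frames (steps) pass between LLM decisions, so the LLM is not called on every single frame.
self.MAX_NUM_ACTIONS: the maximum number of actions the LLM may output in a single decision.
self.AGENTS: stores the configuration of all sub-agents (such as "CombatGroup0" or "Defender").
self.AGENTS_ALWAYS_DISABLE: sub-agent names added to this list are always disabled.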
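As a rough illustration of how limits such as MAX_LLM_QUERY_TIMES and MAX_LLM_WAITING_TIME could bound a single decision, here is a minimal sketch. The function query_with_limits and the query_fn callable are hypothetical and not part of LLM-PySC2; the project may enforce these limits quite differently.

```python
import time


def query_with_limits(query_fn, prompt, max_query_times=5, max_waiting_time=15.0):
    """Call query_fn(prompt) until it succeeds or a limit is reached.

    query_fn is a hypothetical stand-in for an LLM request: it returns a
    string on success and raises an exception on failure.
    Returns (response_or_None, number_of_attempts_made).
    """
    start = time.monotonic()
    attempts = 0
    while attempts < max_query_times:
        if time.monotonic() - start > max_waiting_time:
            break  # waiting budget exhausted before the next attempt
        attempts += 1
        try:
            return query_fn(prompt), attempts
        except Exception:
            pass  # a failed call still counts against the query budget
    return None, attempts
```

A caller that receives None could fall back to a default action such as a no-op; that choice is also an assumption of this sketch.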

reset_llm() method

def reset_llm(self, model_name=None, api_base=None, api_key=None, ...):
    ...
    for agent_name in self.AGENTS.keys():
        self.AGENTS[agent_name]['llm']['model_name'] = self.model_name
        self.AGENTS[agent_name]['llm']['api_base'] = self.api_base
        self.AGENTS[agent_name]['llm']['api_key'] = self.api_key
    ...

This method changes the LLM parameters at runtime. For example, to switch to GPT-4 or to a different api_key, you only need to call it. It also updates the LLM settings of every sub-agent (model_name/api_base/api_key, etc.) so they stay in sync.
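To make the propagation concrete, here is a minimal stand-alone sketch of the idea. MiniConfig is a hypothetical stand-in, not the project's AgentConfig; only the loop over self.AGENTS mirrors the snippet above, and treating a None argument as "keep the current value" is an assumption.

```python
class MiniConfig:
    """Hypothetical stand-in for AgentConfig, reduced to the LLM fields."""

    def __init__(self):
        self.model_name = 'gpt-3.5-turbo'
        self.api_base = 'https://example.invalid/v1'  # placeholder endpoint
        self.api_key = 'sk-placeholder'               # placeholder key
        self.AGENTS = {
            name: {'llm': {'model_name': self.model_name,
                           'api_base': self.api_base,
                           'api_key': self.api_key}}
            for name in ('Builder', 'Defender')
        }

    def reset_llm(self, model_name=None, api_base=None, api_key=None):
        # Assumption: an argument left as None keeps the current value.
        if model_name is not None:
            self.model_name = model_name
        if api_base is not None:
            self.api_base = api_base
        if api_key is not None:
            self.api_key = api_key
        # Copy the top-level settings into every sub-agent's 'llm' dict.
        for agent_name in self.AGENTS.keys():
            self.AGENTS[agent_name]['llm']['model_name'] = self.model_name
            self.AGENTS[agent_name]['llm']['api_base'] = self.api_base
            self.AGENTS[agent_name]['llm']['api_key'] = self.api_key


config = MiniConfig()
config.reset_llm(model_name='gpt-4')
```

After the call, both sub-agents in this sketch report the new model name while keeping the endpoint and key they had before.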

auto_check() method

def auto_check(self, log_id):
    ...

If model_name or api_key is not set correctly, it writes an error log and switches to simulated-LLM mode (LLM_SIMULATION_TIME=5, etc.).
If image input is enabled (ENABLE_IMAGE_RGB) but the selected model does not support image input, it also warns you.

ProtossAgentConfig in detail

class ProtossAgentConfig(AgentConfig):
    def __init__(self):
        super(ProtossAgentConfig, self).__init__()
        self.AGENTS_ALWAYS_DISABLE = []
        self.AGENTS = {
            'Airborne': {
                'describe': "...",
                'llm': {...},
                'team': [...],
                'action': {...}
            },
            'Builder': { ... },
            ...
        }

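ProtossAgentConfig inherits from the AgentConfig class above and configures self.AGENTS in more detail in its constructor (__init__). It defines multiple sub-agents (such as 'Airborne', 'Builder', 'Commander'), and each sub-agent has the following key elements:
'describe': a description of the sub-agent's responsibilities and role; the LLM may use it as contextual prompt when answering.
'llm': the prompt, translators, image options, etc. used by this sub-agent. These usually match the attributes of AgentConfig, but they can be overridden.
'team': which units this sub-agent controls (such as Probe, Stalker, Zealot) and which group or hotkey (game_group) they belong to.
    'select_type' is the way units are selected in pysc2: 'group' means through an existing control-group hotkey, 'select' means box-selecting or clicking on the screen, and 'select_all_type' means selecting all units of the same type, as a double click would.
    {'max_unit_num': {units.Protoss.Zealot: -1}} can be used to limit, or leave unlimited, the number of certain units for this sub-agent.
'action': a dictionary in which each key is a specific unit type (such as units.Protoss.Zealot) and each value is a list of actions.
    Actions are defined in PROTOSS_BASIC_ACTION_2 or in custom lists, for example "attack ground", "move", "use ability". For instance:
    [
        {'name': 'Ability_Blink_Screen', 'arg': ['screen'], 'func': [(180, F.Effect_Blink_screen, ('queued', 'screen'))]}
        ...
    ]
    'func' stores (pysc2_func_id, pysc2_func, (args...)). When the LLM outputs the action 'Ability_Blink_Screen', it is matched to the corresponding pysc2 function and executed with the specified arguments.

Below are some example sub-agents: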
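'Builder': the Probes responsible for constructing buildings;
'Defender': garrison forces, such as Stalkers that use Blink to intercept enemy troops;
'CombatGroup0' to 'CombatGroup9': different squads or unit compositions, such as a damage-dealing group, a scouting group, or an AOE group;
'Commander': the supreme commander, which makes macro decisions and can even call the Nexus's Mass Recall;
'Developer': logistics and research, responsible for warping in units through WarpGate, researching technology, and so on.

Take some of the actions of the 'Defender' sub-agent as an example:

units.Protoss.Stalker: PROTOSS_BASIC_ACTION_2 + [
    {'name': 'Ability_Blink_Screen', 'arg': ['screen'],
     'func': [(180, F.Effect_Blink_screen, ('queued', 'screen'))]},
    {'name': 'Select_Unit_Blink_Screen', 'arg': ['tag', 'screen'],
     'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')),
              (180, F.Effect_Blink_screen, ('now', 'screen'))]},
]

PROTOSS_BASIC_ACTION_2 contains the most basic commands: Stop, No_Operation, Hold_Position, Move_Minimap, Move_Screen and Attack_Unit.
On top of these, two Blink actions are added: Ability_Blink_Screen (the already-grouped Stalkers are selected first, then they blink to the given screen coordinates) and Select_Unit_Blink_Screen (a single Stalker is located by its tag first, then it blinks to the given screen coordinates).

At the end of the file you can also see: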
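
class TerranAgentConfig(AgentConfig):
    def __init__(self):
        super(TerranAgentConfig, self).__init__()

class ZergAgentConfig(AgentConfig):
    def __init__(self):
        super(ZergAgentConfig, self).__init__()

These two classes are currently empty placeholders. Terran has Marine, Marauder, Medivac, etc., and Zerg has Zergling, Hydralisk, Mutalisk, etc.; their abilities (such as Stimpack and Burrow) also need to be configured separately.

Base configuration class (AgentConfig)
    Covers the LLM settings (model_name / api_key), whether images are enabled (ENABLE_IMAGE_RGB / ENABLE_IMAGE_FEATURE), and the safety limits on LLM calls (number of calls, waiting time, etc.).
Protoss configuration (ProtossAgentConfig)
    Defines the Protoss sub-agents in self.AGENTS.
Sub-agent structure
    describe: the role and description of the agent;
    llm: the settings needed to talk to the LLM;
    team: which units are controlled and how they are selected;
    action: the list of actions that can be executed.
Future extension
    TerranAgentConfig and ZergAgentConfig will provide the corresponding implementations for Terran and Zerg.

In short: config.py is the "configuration center" of the whole LLM-PySC2 environment. It decides which LLM is used, how it is called, and how the commanders of the different unit groups (the sub-agents) act.
That gives you an overall understanding of what is in config.py. If you have more detailed questions, feel free to ask!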
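
The (name, arg, func) structure of an action entry described above can be sketched as a small lookup. This is only an illustration under stated assumptions: the string 'Effect_Blink_screen' stands in for the pysc2 function object F.Effect_Blink_screen so the example runs without pysc2, and resolve_action is a hypothetical helper, not the project's actual translator.

```python
BLINK_ACTION = {
    'name': 'Ability_Blink_Screen',
    'arg': ['screen'],
    # (function id, stand-in for the pysc2 function, argument slots)
    'func': [(180, 'Effect_Blink_screen', ('queued', 'screen'))],
}


def resolve_action(action_list, name, llm_args):
    """Match an action name and fill the argument slots of each func entry.

    llm_args maps the argument names declared in 'arg' (e.g. 'screen') to
    values parsed from the LLM's text output. Slots that are not keys of
    llm_args, such as 'queued', are passed through unchanged.
    """
    for action in action_list:
        if action['name'] != name:
            continue
        missing = [a for a in action['arg'] if a not in llm_args]
        if missing:
            raise ValueError(f'missing arguments: {missing}')
        calls = []
        for func_id, func, slots in action['func']:
            filled = tuple(llm_args.get(slot, slot) for slot in slots)
            calls.append((func_id, func, filled))
        return calls
    raise KeyError(f'unknown action: {name}')


calls = resolve_action([BLINK_ACTION], 'Ability_Blink_Screen', {'screen': [32, 40]})
```

In this sketch an unknown action name raises KeyError, so a caller could report the failure back to the LLM instead of sending an invalid command to the game.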
Here is the content of llm_action.py as a supplement. Drawing on your understanding of the pysc2 library, think through how these actions are defined.
# Copyright 2024, LLM-PySC2 Contributors. All Rights Reserved.
from pysc2.lib.actions import FUNCTIONS as F
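from pysc2.lib import features, units  # units is used below (e.g. units.Protoss.Nexus)
from llm_pysc2.lib.utils import *
from loguru import logger
import math  # used below for sqrt/cos/sin; imported explicitly rather than relying on the wildcard import
import random
import re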
AN_ACTION = {'name': '', 'arg': [], 'func': []}
PROTOSS_BASIC_ACTION_SMAC = [
{'name': 'Attack_Unit', 'arg': ['tag'],
'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
]
PROTOSS_BASIC_ACTION_SMAC2 = [
{'name': 'Attack_Unit', 'arg': ['tag'],
'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
{'name': 'Move_Screen', 'arg': ['screen'],
'func': [(331, F.Move_screen, ('queued', 'screen'))]},
]
PROTOSS_BASIC_ACTION_1 = [
{'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
{'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
{'name': 'Stop_Building', 'arg': [], 'func': [(454, F.Stop_Building_quick, ('queued'))]},
]
PROTOSS_BASIC_ACTION_2 = [
{'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
{'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
{'name': 'Hold_Position', 'arg': [], 'func': [(274, F.HoldPosition_quick, ('queued'))]},
{'name': 'Move_Minimap', 'arg': ['minimap'], 'func': [(332, F.Move_minimap, ('queued', 'minimap'))]},
{'name': 'Move_Screen', 'arg': ['screen'], 'func': [(331, F.Move_screen, ('queued', 'screen'))]},
{'name': 'Attack_Unit', 'arg': ['tag'], 'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
]
PROTOSS_BASIC_ACTION_3 = [
{'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
{'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
{'name': 'Hold_Position', 'arg': [], 'func': [(274, F.HoldPosition_quick, ('queued'))]},
{'name': 'Move_Minimap', 'arg': ['minimap'], 'func': [(332, F.Move_minimap, ('queued', 'minimap'))]},
{'name': 'Move_Screen', 'arg': ['screen'], 'func': [(331, F.Move_screen, ('queued', 'screen'))]},
]
PROTOSS_ACTION_WARPTRAIN = [
{'name': 'Warp_Adept_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(505, F.TrainWarp_Adept_screen, ('queued', 'screen_tag'))]}, # tag for WarpprismPhasing/Pylon
{'name': 'Warp_DarkTemplar_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(506, F.TrainWarp_DarkTemplar_screen, ('queued', 'screen_tag'))]}, # tag for WarpprismPhasing/Pylon
{'name': 'Warp_HighTemplar_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(507, F.TrainWarp_HighTemplar_screen, ('queued', 'screen_tag'))]}, # tag for WarpprismPhasing/Pylon
{'name': 'Warp_Sentry_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(508, F.TrainWarp_Sentry_screen, ('queued', 'screen_tag'))]}, # tag for WarpprismPhasing/Pylon
{'name': 'Warp_Stalker_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(509, F.TrainWarp_Stalker_screen, ('queued', 'screen_tag'))]}, # tag for WarpprismPhasing/Pylon
{'name': 'Warp_Zealot_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),
(510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),
(510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),
(510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),]}, # tag for WarpprismPhasing/Pylon
]
PROTOSS_ACTION_TRAIN = [
{'name': 'Train_Mothership', 'arg': [], 'func': [(541, F.Train_Mothership_quick, ('queued'))]},
{'name': 'Train_Adept', 'arg': [], 'func': [(457, F.Train_Adept_quick, ('queued'))]},
{'name': 'Train_DarkTemplar', 'arg': [], 'func': [(465, F.Train_DarkTemplar_quick, ('queued'))]},
{'name': 'Train_HighTemplar', 'arg': [], 'func': [(471, F.Train_HighTemplar_quick, ('queued'))]},
{'name': 'Train_Sentry', 'arg': [], 'func': [(491, F.Train_Sentry_quick, ('queued'))]},
{'name': 'Train_Stalker', 'arg': [], 'func': [(493, F.Train_Stalker_quick, ('queued'))]},
{'name': 'Train_Zealot', 'arg': [], 'func': [(503, F.Train_Zealot_quick, ('queued'))]},
{'name': 'Train_Oracle', 'arg': [], 'func': [(482, F.Train_Oracle_quick, ('queued'))]},
{'name': 'Train_Phoenix', 'arg': [], 'func': [(484, F.Train_Phoenix_quick, ('queued'))]},
{'name': 'Train_VoidRay', 'arg': [], 'func': [(500, F.Train_VoidRay_quick, ('queued'))]},
{'name': 'Train_Tempest', 'arg': [], 'func': [(495, F.Train_Tempest_quick, ('queued'))]},
{'name': 'Train_Carrier', 'arg': [], 'func': [(461, F.Train_Carrier_quick, ('queued'))]},
{'name': 'Train_Observer', 'arg': [], 'func': [(481, F.Train_Observer_quick, ('queued'))]},
{'name': 'Train_WarpPrism', 'arg': [], 'func': [(501, F.Train_WarpPrism_quick, ('queued'))]},
{'name': 'Train_Immortal', 'arg': [], 'func': [(473, F.Train_Immortal_quick, ('queued'))]},
{'name': 'Train_Colossus', 'arg': [], 'func': [(462, F.Train_Colossus_quick, ('queued'))]},
{'name': 'Train_Disruptor', 'arg': [], 'func': [(466, F.Train_Disruptor_quick, ('queued'))]},
]
PROTOSS_ACTION_RESEARCH = [
{'name': 'Research_ProtossAirArmor', 'arg': [],
'func': [(381, F.Research_ProtossAirArmor_quick, ('queued'))]},
{'name': 'Research_ProtossAirWeapons', 'arg': [],
'func': [(385, F.Research_ProtossAirWeapons_quick, ('queued'))]},
{'name': 'Research_WarpGate', 'arg': [],
'func': [(428, F.Research_WarpGate_quick, ('queued'))]},
{'name': 'Research_ProtossGroundArmor', 'arg': [],
'func': [(389, F.Research_ProtossGroundArmor_quick, ('queued'))]},
{'name': 'Research_ProtossGroundWeapons', 'arg': [],
'func': [(393, F.Research_ProtossGroundWeapons_quick, ('queued'))]},
{'name': 'Research_ProtossShields', 'arg': [],
'func': [(397, F.Research_ProtossShields_quick, ('queued'))]},
{'name': 'Research_Charge', 'arg': [],
'func': [(359, F.Research_Charge_quick, ('queued'))]},
{'name': 'Research_Blink', 'arg': [],
'func': [(356, F.Research_Blink_quick, ('queued'))]},
{'name': 'Research_AdeptResonatingGlaives', 'arg': [],
'func': [(351, F.Research_AdeptResonatingGlaives_quick, ('queued'))]},
{'name': 'Research_PhoenixAnionPulseCrystals', 'arg': [],
'func': [(379, F.Research_PhoenixAnionPulseCrystals_quick, ('queued'))]},
{'name': 'Research_ExtendedThermalLance', 'arg': [],
'func': [(364, F.Research_ExtendedThermalLance_quick, ('queued'))]},
{'name': 'Research_GraviticBooster', 'arg': [],
'func': [(366, F.Research_GraviticBooster_quick, ('queued'))]},
{'name': 'Research_GraviticDrive', 'arg': [],
'func': [(367, F.Research_GraviticDrive_quick, ('queued'))]},
{'name': 'Research_PsiStorm', 'arg': [],
'func': [(401, F.Research_PsiStorm_quick, ('queued'))]},
{'name': 'Research_ShadowStrike', 'arg': [],
'func': [(404, F.Research_ShadowStrike_quick, ('queued'))]},
]
PROTOSS_ACTION_BUILD = [
{'name': 'Build_Nexus_Near', 'arg': ['tag'],
'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (65, F.Build_Nexus_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Assimilator_Near', 'arg': ['tag'],
'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Nexus_Screen', 'arg': ['screen'],
'func': [(65, F.Build_Nexus_screen, ('queued', 'screen'))]},
{'name': 'Build_Assimilator_Screen', 'arg': ['screen'],
'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen'))]},
{'name': 'Build_Pylon_Screen', 'arg': ['screen'],
'func': [(70, F.Build_Pylon_screen, ('queued', 'screen'))]},
{'name': 'Build_Gateway_Screen', 'arg': ['screen'],
'func': [(57, F.Build_Gateway_screen, ('queued', 'screen'))]},
{'name': 'Build_CyberneticsCore_Screen', 'arg': ['screen'],
'func': [(48, F.Build_CyberneticsCore_screen, ('queued', 'screen'))]},
{'name': 'Build_Forge_Screen', 'arg': ['screen'],
'func': [(55, F.Build_Forge_screen, ('queued', 'screen'))]},
{'name': 'Build_PhotonCannon_Screen', 'arg': ['screen'],
'func': [(69, F.Build_PhotonCannon_screen, ('queued', 'screen'))]},
{'name': 'Build_ShieldBattery_Screen', 'arg': ['screen'],
'func': [(525, F.Build_ShieldBattery_screen, ('queued', 'screen'))]},
{'name': 'Build_TwilightCouncil_Screen', 'arg': ['screen'],
'func': [(101, F.Build_TwilightCouncil_screen, ('queued', 'screen'))]},
{'name': 'Build_TemplarArchive_Screen', 'arg': ['screen'],
'func': [(100, F.Build_TemplarArchive_screen, ('queued', 'screen'))]},
{'name': 'Build_DarkShrine_Screen', 'arg': ['screen'],
'func': [(49, F.Build_DarkShrine_screen, ('queued', 'screen'))]},
{'name': 'Build_Stargate_Screen', 'arg': ['screen'],
'func': [(88, F.Build_Stargate_screen, ('queued', 'screen'))]},
{'name': 'Build_FleetBeacon_Screen', 'arg': ['screen'],
'func': [(54, F.Build_FleetBeacon_screen, ('queued', 'screen'))]},
{'name': 'Build_RoboticsBay_Screen', 'arg': ['screen'],
'func': [(81, F.Build_RoboticsBay_screen, ('queued', 'screen'))]},
{'name': 'Build_RoboticsFacility_Screen', 'arg': ['screen'],
'func': [(82, F.Build_RoboticsFacility_screen, ('queued', 'screen'))]},
]
PROTOSS_ACTION_EASY_BUILD = [
{'name': 'Build_Nexus_Near', 'arg': ['tag'],
'func': [(65, F.Build_Nexus_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Assimilator_Near', 'arg': ['tag'],
'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Pylon_Near', 'arg': ['tag'],
'func': [(70, F.Build_Pylon_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Gateway_Near', 'arg': ['tag'],
'func': [(57, F.Build_Gateway_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_CyberneticsCore_Near', 'arg': ['tag'],
'func': [(48, F.Build_CyberneticsCore_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Forge_Near', 'arg': ['tag'],
'func': [(55, F.Build_Forge_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_PhotonCannon_Near', 'arg': ['tag'],
'func': [(69, F.Build_PhotonCannon_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_ShieldBattery_Near', 'arg': ['tag'],
'func': [(525, F.Build_ShieldBattery_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_TwilightCouncil_Near', 'arg': ['tag'],
'func': [(101, F.Build_TwilightCouncil_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_TemplarArchive_Near', 'arg': ['tag'],
'func': [(100, F.Build_TemplarArchive_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_DarkShrine_Near', 'arg': ['tag'],
'func': [(49, F.Build_DarkShrine_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Stargate_Near', 'arg': ['tag'],
'func': [(88, F.Build_Stargate_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_FleetBeacon_Near', 'arg': ['tag'],
'func': [(54, F.Build_FleetBeacon_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_RoboticsBay_Near', 'arg': ['tag'],
'func': [(81, F.Build_RoboticsBay_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_RoboticsFacility_Near', 'arg': ['tag'],
'func': [(82, F.Build_RoboticsFacility_screen, ('queued', 'screen_tag'))]},
]
PROTOSS_ACTION_ABILITY = [
{'name': 'Morph_WarpPrismPhasingMode', 'arg': [],
'func': [(329, F.Morph_WarpPrismPhasingMode_quick, ('queued'))]},
{'name': 'Morph_WarpPrismTransportMode', 'arg': [],
'func': [(330, F.Morph_WarpPrismTransportMode_quick, ('queued'))]},
{'name': 'Morph_SurveillanceMode', 'arg': [],
'func': [(538, F.Morph_SurveillanceMode_quick, ('queued'))]},
{'name': 'Morph_ObserverMode', 'arg': [],
'func': [(535, F.Morph_ObserverMode_quick, ('queued'))]},
{'name': 'Morph_Archon', 'arg': [],
'func': [(296, F.Morph_Archon_quick, ('queued'))]},
{'name': 'Ability_PrismaticAlignment', 'arg': [],
'func': [(244, F.Effect_VoidRayPrismaticAlignment_quick, ('queued'))]},
{'name': 'Ability_CancelPhaseShift', 'arg': [],
'func': [(453, F.Stop_quick, ('queued'))]},
{'name': 'Ability_GuardianShield', 'arg': [],
'func': [(197, F.Effect_GuardianShield_quick, ('queued'))]},
{'name': 'Ability_PulsarBeamOn', 'arg': [],
'func': [(38, F.Behavior_PulsarBeamOn_quick, ('queued'))]},
{'name': 'Ability_ChronoBoost_Unit', 'arg': ['tag'],
'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (527, F.Effect_ChronoBoostEnergyCost_screen, ('queued', 'screen_tag'))]},
{'name': 'Ability_NexusMassRecall_Near', 'arg': ['tag'],
'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (529, F.Effect_MassRecall_Nexus_screen, ('queued', 'screen_tag'))]},
{'name': 'Ability_MothershipMassRecall_Near', 'arg': ['tag'],
'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (208, F.Effect_MassRecall_screen, ('queued', 'screen_tag'))]},
{'name': 'Ability_ShadowStride_Unit', 'arg': ['tag'],
'func': [(182, F.Effect_ShadowStride_screen, ('queued', 'screen_tag'))]},
{'name': 'Ability_GravitonBeam_Unit', 'arg': ['tag'],
'func': [(196, F.Effect_GravitonBeam_screen, ('queued', 'screen_tag'))]},
{'name': 'Ability_Blink_Screen', 'arg': ['screen'],
'func': [(180, F.Effect_Blink_screen, ('queued', 'screen'))]},
{'name': 'Ability_AdeptPhaseShift_Screen', 'arg': ['screen'],
'func': [(177, F.Effect_AdeptPhaseShift_screen, ('queued', 'screen'))]},
{'name': 'Ability_PsiStorm_Screen', 'arg': ['screen'],
'func': [(218, F.Effect_PsiStorm_screen, ('queued', 'screen'))]},
{'name': 'Ability_PurificationNova_Screen', 'arg': ['screen'],
'func': [(219, F.Effect_PurificationNova_screen, ('queued', 'screen'))]},
{'name': 'Ability_ForceField_Screen', 'arg': ['screen'],
'func': [(193, F.Effect_ForceField_screen, ('queued', 'screen'))]},
{'name': 'Ability_TimeWarp_Screen', 'arg': ['screen'],
'func': [(241, F.Effect_TimeWarp_screen, ('queued', 'screen'))]},
{'name': 'Ability_OracleRevelation_Screen', 'arg': ['screen'],
'func': [(214, F.Effect_OracleRevelation_screen, ('queued', 'screen'))]},
{'name': 'Ability_StasisTrap_Screen', 'arg': ['screen'],
'func': [(90, F.Build_StasisTrap_screen, ('queued', 'screen'))]},
{'name': 'Load_Unit', 'arg': ['tag'],
'func': [(287, F.Load_screen, ('queued', 'screen_tag'))]},
{'name': 'Unload_Screen', 'arg': ['screen'],
'func': [(516, F.UnloadAllAt_screen, ('queued', 'screen'))]},
{'name': 'Lock_Nexus_Near', 'arg': ['tag'],
'func': [(70, F.Build_Pylon_screen, ('queued', 'screen_tag'))]},
{'name': 'Lock_Assimilator_Near', 'arg': ['tag'],
'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen_tag'))]},
{'name': 'Select_Unit_Blink_Screen', 'arg': ['tag', 'screen'],
'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')),
(180, F.Effect_Blink_screen, ('queued', 'screen'))]},
]
def find_unit_type_the_func_belongs_to(func_id, race):
    if race == 'protoss':
        if func_id in [541, 485]:
            return units.Protoss.Nexus
        if func_id in [457, 465, 471, 491, 493, 503]:
            return units.Protoss.Gateway
        if func_id in [482, 484, 500, 495, 461]:
            return units.Protoss.Stargate
        if func_id in [481, 501, 473, 462, 466]:
            return units.Protoss.RoboticsFacility
        if func_id in [381, 385, 428]:
            return units.Protoss.CyberneticsCore
        if func_id in [389, 393, 397]:
            return units.Protoss.Forge
        if func_id in [359, 356, 351]:
            return units.Protoss.TwilightCouncil
        if func_id in [379]:  # two upgrade options are still missing
            return units.Protoss.FleetBeacon
        if func_id in [364, 366, 367]:
            return units.Protoss.RoboticsBay
        if func_id in [401]:
            return units.Protoss.TemplarArchive
        if func_id in [404]:
            return units.Protoss.DarkShrine
    if race == 'zerg':
        pass  # Zerg support is in progress
    if race == 'terran':
        pass  # Terran support is in progress
    return None


def find_building_size(build_name: str) -> int:
    if build_name in SIZE5_BUILDING_NAMES:  # Support for Terran/Zerg is undergoing
        return 5
    elif build_name in SIZE3_BUILDING_NAMES:  # Support for Terran/Zerg is undergoing
        return 3
    elif build_name in SIZE2_BUILDING_NAMES:  # Support for Terran/Zerg is undergoing
        return 2
    elif build_name in SIZE1_BUILDING_NAMES:  # Support for Terran/Zerg is undergoing
        return 1
    else:
        return 0


def find_idle_unit_tag(obs, unit_type, queued_source_unit_tag_list):
    for unit in obs.observation.raw_units:
        if (unit.alliance == features.PlayerRelative.SELF and unit.build_progress == 100 and unit.active == 0
                and unit_type is not None and unit.unit_type == unit_type
                and unit.tag not in queued_source_unit_tag_list):
            return unit.tag
    return None


def get_arg_minimap(obs, minimap: list, size_minimap, action_name) -> (tuple, bool):  # minimap coordinates, range-checked
    if isinstance(minimap, list) and len(minimap) == 2 and isinstance(minimap[0], (int, float)) and isinstance(minimap[1], (int, float)):
        # cast to int so the values can index the feature layer, as get_arg_screen does
        x = int(min(max(0, minimap[0]), size_minimap))
        y = int(min(max(0, minimap[1]), size_minimap))
        if 'Attack' in action_name and obs.observation.feature_minimap.player_relative[x][y] in [1, 2]:
            return f'({x}, {y}) is alliance', False
        if 'Load' in action_name and obs.observation.feature_minimap.player_relative[x][y] not in [1, 2]:
            return f'({x}, {y}) is not alliance', False
        if 'Follow' in action_name and obs.observation.feature_minimap.player_relative[x][y] not in [1, 2]:
            return f'({x}, {y}) is not alliance', False
        return (x, y), True
    return f'minimap={minimap}, unknown error', False


def get_arg_screen(obs, screen: list, size_screen, action_name) -> (tuple, bool):  # screen coordinates, range-checked
    if isinstance(screen, list) and len(screen) == 2 and isinstance(screen[0], (int, float)) and isinstance(screen[1], (int, float)):
        x = int(min(max(0, screen[0]), size_screen))
        y = int(min(max(0, screen[1]), size_screen))
        if 'Attack' in action_name and obs.observation.feature_screen.player_relative[x][y] in [1, 2]:
            return f'({x}, {y}) is alliance', False
        if 'Load' in action_name and obs.observation.feature_screen.player_relative[x][y] not in [1, 2]:
            return f'({x}, {y}) is not alliance', False
        if 'Follow' in action_name and obs.observation.feature_screen.player_relative[x][y] not in [1, 2]:
            return f'({x}, {y}) is not alliance', False
        return (x, y), True
    return f'input arg {screen} error', False


def get_arg_screen_build(obs, screen: list, size_screen, action_name) -> (tuple, bool):  # standard build: validate location and build conditions
    building_name = action_name.split('Build_')[1].split('_Screen')[0]
    building_size = find_building_size(building_name)
    if (isinstance(screen, list) and len(screen) == 2 and isinstance(screen[0], (int, float))
            and isinstance(screen[1], (int, float)) and building_size != 0):
        ratio = int(size_screen / SCREEN_WORLD_GRID)
        # (x0, y0) is the clamped build position; (x1, y1) is the first cell of the building footprint
        x0 = int(min(max(0, screen[0]), size_screen))
        y0 = int(min(max(0, screen[1]), size_screen))
        x1 = int(min(max(0, screen[0]), size_screen) - ratio * (building_size - 1) / 2)
        y1 = int(min(max(0, screen[1]), size_screen) - ratio * (building_size - 1) / 2)
        if building_name in POWER_BUILDING_NAMES and obs.observation.feature_screen.power[x0][y0] == 0:
            return f'({x0}, {y0}) need power', False
        if building_name in CREEP_BUILDING_NAMES and obs.observation.feature_screen.creep[x0][y0] == 0:
            return f'({x0}, {y0}) need creep', False
        # check every grid cell covered by the building
        for i in range(building_size):
            for j in range(building_size):
                x = int(x1 + i * ratio)
                y = int(y1 + j * ratio)
                if not (0 < x < size_screen and 0 < y < size_screen):
                    return f'({x0}, {y0}) too close to screen edge', False
                if obs.observation.feature_screen.buildable[x][y] != 1:
                    return f'area near ({x0}, {y0}) not buildable', False
                if obs.observation.feature_screen.pathable[x][y] != 1:
                    return f'area near ({x0}, {y0}) not pathable', False
                if obs.observation.feature_screen.player_relative[x][y] not in [0, 1]:
                    return f'area near ({x0}, {y0}) is blocked', False
        return (x0, y0), True
    return f'input arg {screen} error', False


def get_arg_world_tag(obs, tag: int, x_offset, y_offset, world_range) -> (tuple, bool):  # world coordinates of the unit with the given tag
    for unit in obs.observation.raw_units:
        if unit.tag == tag:
            x = unit.x + x_offset
            y = max(0, world_range - unit.y + y_offset)
            return (x, y), True
    tag = hex(tag) if isinstance(tag, int) else tag
    return f'cannot find unit {tag}', False


def get_arg_screen_tag(obs, tag: int, size_screen, action_name) -> (tuple, bool):  # screen coordinates of the unit with the given tag
    for unit in obs.observation.feature_units:
        if unit.tag == tag:
            x, y = unit.x, unit.y
            if 'Attack' in action_name and unit.alliance in [1, 2]:
                return f'({x}, {y}) is alliance', False
            if 'Load' in action_name and unit.alliance not in [1]:
                return f'({x}, {y}) is not alliance', False
            if 'Follow' in action_name and unit.alliance not in [1, 2]:
                return f'({x}, {y}) is not alliance', False
            if 'MassRecall' in action_name and unit.alliance not in [1]:
                return f'({x}, {y}) is not alliance', False
            # the action defined above is named 'Ability_ChronoBoost_Unit', so match both spellings
            if ('ChronoBoost' in action_name or 'Chrono_Boost' in action_name) and (
                    unit.alliance not in [1] or unit.unit_type not in BOOSTABLE_TYPE):
                return f'({x}, {y}) is not boostable', False
            if 'Board_' in action_name and (unit.alliance not in [1] or unit.unit_type not in TRANSPORTER_TYPE):
                return f'({x}, {y}) is not a transporter', False
            if unit.is_on_screen and (0 < unit.x < size_screen and 0 < unit.y < size_screen):
                return (unit.x, unit.y), True
    tag = hex(tag) if isinstance(tag, int) else tag
    return f'cannot find unit {tag} on screen', False


def get_arg_screen_tag_sclect_rect(obs, tag: int, size_screen, func_arg_name) -> (tuple, bool):  # corner of the selection rectangle around the unit with the given tag
    for unit in obs.observation.feature_units:
        if unit.tag == tag:
            if unit.alliance not in [1]:
                return f'({unit.x}, {unit.y}) is not alliance', False
            if not (0 < unit.x < size_screen and 0 < unit.y < size_screen):
                return f'unit {tag} ({unit.x}, {unit.y}) not on screen', False
            if func_arg_name == 'screen' and unit.is_on_screen:
                # first corner: slightly up and to the left of the unit
                x = max(0, min(int(unit.x - size_screen / 64), size_screen))
                y = max(0, min(int(unit.y - size_screen / 64), size_screen))
                return (x, y), True
            if func_arg_name == 'screen2' and unit.is_on_screen:
                # second corner: slightly down and to the right of the unit
                x = max(0, min(int(unit.x + size_screen / 64), size_screen))
                y = max(0, min(int(unit.y + size_screen / 64), size_screen))
                return (x, y), True
    tag = hex(tag) if isinstance(tag, int) else tag
    return f'cannot find unit {tag} on screen', False


def get_arg_screen_tag_recall(obs, tag: int, size_screen, action_name) -> (tuple, bool):  # screen coordinates of the allied unit with the given tag (recall target)
    for unit in obs.observation.feature_units:
        if unit.tag == tag:
            x, y = unit.x, unit.y
            if unit.alliance not in [1]:
                return f'({x}, {y}) is not alliance', False
            if unit.is_on_screen and (0 < unit.x < size_screen and 0 < unit.y < size_screen):
                return (unit.x, unit.y), True
    tag = hex(tag) if isinstance(tag, int) else tag
    return f'cannot find unit {tag} on screen', False


def get_arg_screen_tag_warp(obs, tag: int, size_screen, action_name) -> (tuple, bool):  # find a position near the given tag where a unit can be warped in
    n = 0
    for unit in obs.observation.feature_units:
        max_try = 72
        if unit.tag == tag:
            if unit.unit_type not in [units.Protoss.Pylon, units.Protoss.WarpPrismPhasing]:
                return f'tag {unit.tag}({unit.unit_type}) is not Pylon(60) or WarpPrismPhasing(136)', False
            else:
                radius = [2, 3, 4, 5, 6] if unit.unit_type == units.Protoss.Pylon else [1, 2, 3]
                angles = [0, 45, 90, 135, 180, 225, 270, 315]
                # sample random (radius, angle) offsets until a free, powered, pathable cell is found
                while n < max_try:
                    r = radius[random.randint(0, len(radius) - 1)]
                    a = angles[random.randint(0, len(angles) - 1)]
                    x = int(unit.x + r * (size_screen / SCREEN_WORLD_GRID) * math.cos(math.radians(a)))
                    y = int(unit.y + r * (size_screen / SCREEN_WORLD_GRID) * math.sin(math.radians(a)))
                    if ((0 < x < size_screen and 0 < y < size_screen) and
                            obs.observation.feature_screen.power[x][y] == 1 and
                            obs.observation.feature_screen.pathable[x][y] == 1 and
                            obs.observation.feature_screen.unit_type[x][y] == 0 and
                            obs.observation.feature_screen.build_progress[x][y] == 0 and
                            obs.observation.feature_screen.unit_shields[x][y] == 0):
                        return (x, y), True
                    else:
                        n = n + 1
                if n >= max_try:
                    return 'cannot find valid position to warp unit', False
    tag = hex(tag) if isinstance(tag, int) else tag
    return f'cannot find unit {tag} on screen', False


def get_arg_screen_tag_gas_building(obs, tag: int, size_screen, action_name) -> (tuple, bool):
    unit_r = None
    for unit in obs.observation.raw_units:
        if unit.tag == tag:
            unit_r = unit
    if unit_r is None:
        tag = hex(tag) if isinstance(tag, int) else tag
        return f'cannot find unit {tag}', False
    for unit in obs.observation.feature_units:
        if unit.tag == tag:
            base_nearby = False
            for unit_ in obs.observation.raw_units:
                if (unit_.alliance == features.PlayerRelative.SELF and unit_.unit_type in BASE_BUILDING_TYPE and
                        math.sqrt((unit_.x - unit_r.x) ** 2 + (unit_.y - unit_r.y) ** 2) < 10):
                    base_nearby = True
            if not base_nearby:
                return f'tag {unit.tag}({unit.unit_type}) is far away from our base building', False
            if unit.unit_type not in GAS_TYPE:
                return f'tag {unit.tag}({unit.unit_type}) is not VespeneGeyser(342 344 608 880 881)', False
            if unit.is_on_screen and (0 < unit.x < size_screen and 0 < unit.y < size_screen):
                return (unit.x, unit.y), True
    tag = hex(tag) if isinstance(tag, int) else tag
    return f'cannot find unit {tag} on screen', False




def get_arg_world_tag_base_building(obs, tag: int, x_offset, y_offset, world_range) -> (tuple, bool):

    def find_nearby_raw_mg(unit_g):
        # Collect mineral fields and vespene geysers within 16 world units of unit_g.
        nearby_resource_unit_dict = {}
        for unit in obs.observation.raw_units:
            if unit.unit_type in MINERAL_TYPE:
                dist = math.sqrt((unit.x - unit_g.x) ** 2 + (unit.y - unit_g.y) ** 2)
                if dist < 16:
                    nearby_resource_unit_dict[dist] = unit
            if unit.unit_type in GAS_TYPE:
                dist = math.sqrt((unit.x - unit_g.x) ** 2 + (unit.y - unit_g.y) ** 2)
                if dist < 16:
                    nearby_resource_unit_dict[dist] = unit
        return nearby_resource_unit_dict.values()

    def artificial_force_field_iteration_world(unit_list, x, y):
        # One step of an artificial force field: each resource pushes or pulls (x, y)
        # toward its preferred distance r.
        k, r, m = 0.5, 7, 1
        vespene_r, vespene_m = 8, 1
        mineral_r, mineral_m = 7, 1
        n, bad_n, fx, fy = 0, 0, 0, 0
        for unit in unit_list:
            bad = False
            if unit.unit_type in GAS_TYPE:
                r, m = vespene_r, vespene_m
            if unit.unit_type in MINERAL_TYPE:
                r, m = mineral_r, mineral_m
            d = math.sqrt((unit.x - x) ** 2 + (unit.y - y) ** 2)
            f = k * (r - d) * m
            fx += f * (x - unit.x) / d
            fy += f * (y - unit.y) / d
            n += 1
            # Count resources whose distance falls outside the acceptable band.
            if unit.unit_type in GAS_TYPE and not (7 < d < 10):
                bad = True
            if unit.unit_type in MINERAL_TYPE and not (6 < d < 9):
                bad = True
            if bad:
                bad_n += 1
        return (x + fx / n), (y + fy / n), bad_n

    for unit in obs.observation.raw_units:
        if unit.tag == tag:
            if unit.unit_type not in GAS_TYPE + MINERAL_TYPE:
                return f'tag {unit.tag}({unit.unit_type}) is not a mineral field or vespene geyser', False
            mineral_list = find_nearby_raw_mg(unit)
            # Start from the centroid of the nearby resources, then relax for 16 iterations.
            n, x0, y0 = 0, 0, 0
            for mineral in mineral_list:
                n += 1
                x0 += mineral.x
                y0 += mineral.y
            x0 = x0 / n
            y0 = y0 / n
            for i in range(16):
                x0, y0, bad_n = artificial_force_field_iteration_world(mineral_list, x0, y0)
            x = int(x0 + x_offset)
            y = int(max(0, world_range - y0 + y_offset))
            return (x, y), True
    tag = hex(tag) if isinstance(tag, int) else tag
    return f'cannot find unit {tag} on screen', False


def get_arg_screen_tag_base_building(obs, tag: int, size_screen, action_name) -> (tuple, bool):

    def find_nearby_screen_mg(unit_g):
        # Same search as the world-coordinate version, scaled to screen pixels.
        ratio = size_screen / SCREEN_WORLD_GRID
        nearby_resource_unit_dict = {}
        for unit in obs.observation.feature_units:
            if unit.unit_type in MINERAL_TYPE:
                dist = math.sqrt((unit.x - unit_g.x) ** 2 + (unit.y - unit_g.y) ** 2)
                if dist < 16 * ratio:
                    nearby_resource_unit_dict[dist] = unit
            if unit.unit_type in GAS_TYPE:
                dist = math.sqrt((unit.x - unit_g.x) ** 2 + (unit.y - unit_g.y) ** 2)
                if dist < 16 * ratio:
                    nearby_resource_unit_dict[dist] = unit
        return nearby_resource_unit_dict.values()

    def artificial_force_field_iteration_screen(unit_list, x, y):
        ratio = size_screen / SCREEN_WORLD_GRID
        k, r, m = 0.5, 7 * ratio, 1
        vespene_r, vespene_m = 8 * ratio, 1
        mineral_r, mineral_m = 7 * ratio, 1
        n, bad_n, fx, fy = 0, 0, 0, 0
        for unit in unit_list:
            bad = False
            if unit.unit_type in GAS_TYPE:
                r, m = vespene_r, vespene_m
            if unit.unit_type in MINERAL_TYPE:
                r, m = mineral_r, mineral_m
            d = math.sqrt((unit.x - x) ** 2 + (unit.y - y) ** 2)
            f = k * (r - d) * m
            fx += f * (x - unit.x) / d
            fy += f * (y - unit.y) / d
            n += 1
            if unit.unit_type in GAS_TYPE and not (7 * ratio < d < 10 * ratio):
                bad = True
            if unit.unit_type in MINERAL_TYPE and not (6 * ratio < d < 9 * ratio):
                bad = True
            if bad:
                bad_n += 1
        return (x + fx / n), (y + fy / n), bad_n

    # e.g. 'Build_Nexus_Near' -> 'Nexus'
    building_name = action_name.split('_Screen')[0].split('_')[1]  # Build/Lock
    building_size = find_building_size(building_name)
    for unit in obs.observation.feature_units:
        if unit.tag == tag or (unit.unit_type in GAS_TYPE + MINERAL_TYPE and unit.is_on_screen):
            if unit.unit_type not in GAS_TYPE + MINERAL_TYPE:
                return f'tag {unit.tag}({unit.unit_type}) is not a mineral field or vespene geyser', False
            mineral_gas_list = find_nearby_screen_mg(unit)
            n, x0, y0 = 0, 0, 0
            for mineral in mineral_gas_list:
                n += 1
                x0 += mineral.x
                y0 += mineral.y
            x = x0 / n
            y = y0 / n
            bad_n = len(mineral_gas_list)
            for i in range(16):
                x, y, bad_n = artificial_force_field_iteration_screen(mineral_gas_list, x, y)
            x, y = int(x), int(y)
            if bad_n > 3:
                return f'({x}, {y}) may be a bad position for base building', False
            if not (0 < x < size_screen and 0 < y < size_screen):
                return f'({x}, {y}) too close to screen edge', False
            if obs.observation.feature_screen.buildable[x][y] != 1:
                return f'area near ({x}, {y}) not buildable', False
            if obs.observation.feature_screen.pathable[x][y] != 1:
                return f'area near ({x}, {y}) not pathable', False
            if obs.observation.feature_screen.player_relative[x][y] not in [0, 1]:
                return f'area near ({x}, {y}) is blocked', False
            if unit.is_on_screen and (0 < x < size_screen and 0 < y < size_screen):
                return (x, y), True
    tag = hex(tag) if isinstance(tag, int) else tag
    return f'cannot find unit {tag} on screen', False


def add_func_for_select_workers(self, obs, action):
    action_name = action['name']
    action_arg = action['arg']
    action_func = action['func']
    if not ('Select_Workers_' in action_name):
        return action
    full_shape_action = None
    logger.debug(self.action_list)
    func_id, func, arg_type = action_func[0]
    # Pick one of our own on-screen workers as the unit to click on.
    source_unit_tag = None
    for unit in obs.observation.raw_units:
        if (unit.alliance == features.PlayerRelative.SELF and unit.unit_type in WORKER_TYPE and
                unit.is_on_screen and (0 < unit.x < self.size_screen and 0 < unit.y < self.size_screen)):
            source_unit_tag = unit.tag
    if source_unit_tag is None:
        logger.error(
            f"[ID {self.log_id}] Agent {self.name}, Can not find source unit type for func {actions.FUNCTIONS[func_id].name}")
    else:
        logger.debug(f"[ID {self.log_id}] Agent {self.name}, find source unit worker {source_unit_tag}")
    if source_unit_tag is not None:
        # Prepend a select_point step so the original functions act on the workers.
        full_shape_action = {'name': action_name, 'arg': [], 'func':
            [(2, actions.FUNCTIONS.select_point, ['select_all_type', int(source_unit_tag)])] + action['func']}
    else:
        full_shape_action = {'name': 'No_Operation', 'arg': [], 'func':
            [(0, actions.FUNCTIONS.no_op, {})]}
    return full_shape_action


def add_func_for_train_and_research(self, obs, action):
    action_name = action['name']
    action_arg = action['arg']
    action_func = action['func']
    if not ('Train_' in action_name or 'Research_' in action_name):
        return action
    full_shape_action = None
    # Buildings already queued to train/upgrade; used to avoid selecting the same one twice.
    queued_source_unit_tag_list = []
    logger.debug(self.action_list)
    func_id, func, arg_type = action_func[0]
    source_unit_type = find_unit_type_the_func_belongs_to(func_id, self.race)
    source_unit_tag = find_idle_unit_tag(obs, source_unit_type, queued_source_unit_tag_list)
    if source_unit_type is None:
        logger.error(
            f"[ID {self.log_id}] Agent {self.name}, Can not find source unit type for func {actions.FUNCTIONS[func_id].name}")
    elif source_unit_tag is None:
        logger.error(
            f"[ID {self.log_id}] Agent {self.name}, Can not find source unit of {str(units.get_unit_type(source_unit_type))} type")
    else:
        queued_source_unit_tag_list.append(source_unit_tag)
    if source_unit_tag is not None:
        # Move the camera to the building and select it before issuing the train/research call.
        full_shape_action = {'name': action_name, 'arg': [], 'func':
            [(573, actions.FUNCTIONS.llm_pysc2_move_camera, [int(source_unit_tag)]),
             (2, actions.FUNCTIONS.select_point, ['select', int(source_unit_tag)])] + action['func']}
    else:
        full_shape_action = {'name': 'No_Operation', 'arg': [], 'func':
            [(0, actions.FUNCTIONS.no_op, {})]}
    return full_shape_action


class BaseTranslatorA:
    def __init__(self):
        pass

    def translate(self, obs) -> "list of [(func_id, func_call)]":
        pass


class DefaultTranslatorA(BaseTranslatorA):
    def __init__(self, name, log_id, config):
        super(DefaultTranslatorA, self).__init__()
        self.agent_name = name
        self.ACTION_SPACE = config.AGENTS[name]['action']
        # Flatten the per-unit-type action lists into a name -> action lookup.
        self.ACTION_SPACE_DICT = {}
        for unit_type in self.ACTION_SPACE.keys():
            for action in self.ACTION_SPACE[unit_type]:
                self.ACTION_SPACE_DICT[action['name']] = action
        self.log_id = log_id
        logger.info(f"[ID {self.log_id}] {name} DefaultTranslatorA initialized")

    def translate(self, raw_text_a: str):
        action_list_dict = {}
        action_lists, action_lists2 = [], []
        team_actions, team_actions2 = [], []
        processed_text_a, team_name = '', ''
        lines = raw_text_a.splitlines()
        start_recognize = False
        first_function = True
        for line in lines:
            if (("Actions:" in line) or ("Action:" in line) or
                    ("actions:" in line) or ("action:" in line)):
                processed_text_a = "Actions:"
                start_recognize = True
            if (("Communications:" in line) or ("Communication:" in line) or
                    ("communications:" in line) or ("communication:" in line)):
                start_recognize = False
            if start_recognize:
                if ("Team" in line and ":" in line) or ("team" in line and ":" in line):
                    # Store the previous team's actions under its own name before switching teams.
                    if len(team_actions) != 0:
                        action_lists.append(team_actions)
                        action_lists2.append(team_actions2)
                        action_list_dict[team_name] = team_actions
                    first_function = True
                    team_actions, team_actions2 = [], []
                    team_name = line.split("eam ")[1].split(":")[0]  # Team/team xxxx: --> xxxx
                    processed_text_a += f"\n\tTeam {team_name}:"
                elif "<" in line and ">" in line:
                    action_text = line.split("<")[1].split(">")[0]
                    action_name = action_text.split("(")[0]
                    action_args = action_text.split("(")[1].split(")")[0]
                    action_valid, tag, tag2, tag3, x, y = True, None, None, None, None, None
                    if "0x" in action_args:
                        tag = int(re.findall(r'0x\w+', action_args)[0], 16)
                        if len(re.findall(r'0x\w+', action_args)) > 1:
                            tag2 = int(re.findall(r'0x\w+', action_args)[1], 16)
                        if len(re.findall(r'0x\w+', action_args)) > 2:
                            tag3 = int(re.findall(r'0x\w+', action_args)[2], 16)
                    if "[" in action_args:
                        # NOTE: the two regex patterns below were lost in extraction and are
                        # reconstructed here; they read x after '[' and y before ']'.
                        x = float(re.findall(r'\[\s*-?\d+\.?\d*', action_args)[0].split("[")[1])
                        y = float(re.findall(r'-?\d+\.?\d*\s*\]', action_args)[0].split("]")[0])

                    # Look up the action that action_name refers to in the action space.
                    if action_name in self.ACTION_SPACE_DICT.keys():
                        action = self.ACTION_SPACE_DICT[action_name]
                    else:
                        logger.error(f"translator unable to find {action_name}")
                        action = {'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]}
                        action_valid = False

                    # Fill the recognized arguments into the function argument tuples.
                    new_func_triples, new_func_triples2 = [], []
                    for func_triple in action['func']:  # func_triple looks like (0, F.no_op, ())
                        new_func_triple, new_func_triple2, new_func_args = [], [], []
                        func_args = func_triple[2]
                        if len(list(func_args)) > 0:
                            if not isinstance(func_args, tuple):
                                func_args = [func_args]
                            for arg in list(func_args):
                                if arg == "now":
                                    new_func_args.append('now')
                                if arg == "queued":
                                    # The first function of a team runs immediately; later ones are queued.
                                    if first_function:
                                        new_func_args.append('now')
                                    else:
                                        new_func_args.append('queued')
                                if arg == "select":
                                    new_func_args.append('select')
                                if arg in ["screen_tag", "minimap_tag", "world_tag", "screen1_tag", "screen2_tag"]:
                                    if tag is not None:
                                        new_func_args.append(tag)
                                    else:
                                        new_func_args.append('error')
                                if arg in ["screen_tag2", "minimap_tag2", "world_tag2", "screen1_tag2", "screen2_tag2"]:
                                    if tag2 is not None:
                                        new_func_args.append(tag2)
                                    else:
                                        new_func_args.append('error')
                                if arg in ["screen", "minimap"]:
                                    if (x is not None) and (y is not None):
                                        new_func_args.append([x, y])
                                    else:
                                        new_func_args.append('error')
                                # if arg in ["screen1_tag", "screen2_tag"]:
                                #     if tag is not None:
                                #         new_func_args.append(tag)
                                #     else:
                                #         new_func_args.append('error')
                                # if arg in ["screen1_tag2", "screen2_tag2"]:
                                #     if tag2 is not None:
                                #         new_func_args.append(tag2)
                                #     else:
                                #         new_func_args.append('error')
                        if 'error' not in new_func_args and first_function and 'now' in new_func_args:
                            first_function = False
                        if 'error' in new_func_args:
                            action_valid = False
                        new_func_triple.append(func_triple[0])
                        new_func_triple.append(func_triple[1])
                        new_func_triple.append(tuple(new_func_args))
                        new_func_triples.append(tuple(new_func_triple))
                        new_func_triple2.append(func_triple[0])
                        new_func_triple2.append(func_triple[1].name)
                        new_func_triple2.append(tuple(new_func_args))
                        new_func_triples2.append(tuple(new_func_triple2))

                    if action_valid:
                        team_actions.append({'name': action['name'], 'arg': action['arg'], 'func': new_func_triples})
                        team_actions2.append({'name': action['name'], 'arg': action['arg'], 'func': new_func_triples2})
                        processed_text_a += f"\n\t\t<{action_text}>"
                    else:
                        team_actions.append({'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]})
                        team_actions2.append({'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]})
                else:
                    pass

        # Store the last team's actions.
        if len(team_actions) != 0:
            action_lists.append(team_actions)
            action_lists2.append(team_actions2)
            action_list_dict[team_name] = team_actions
        first_function = True
        team_actions, team_actions2 = [], []
        return action_lists, action_list_dict, processed_text_a
PROTOSS_FACTORY = {'default': DefaultTranslatorA}
TERRAN_FACTORY = {}
ZERG_FACTORY = {}
FACTORY = {
'protoss': PROTOSS_FACTORY,
'terran': TERRAN_FACTORY,
'zerg': ZERG_FACTORY,
}


if __name__ == "__main__":
    from llm_pysc2.agents.configs.config import ProtossAgentConfig

    config = ProtossAgentConfig()

    def show(config):
        # Print each agent's action space, grouped by unit type and argument kind.
        for name in config.AGENTS.keys():
            agent_actions = config.AGENTS[name]['action']
            print(name)
            for unit_type in agent_actions.keys():
                print(f"\t{str(units.get_unit_type(unit_type))}")
                for i in range(len(agent_actions[unit_type])):
                    action = agent_actions[unit_type][i]
                    if len(action['arg']) == 0:
                        print(f"\t\t <{action['name']}()>")
                for i in range(len(agent_actions[unit_type])):
                    action = agent_actions[unit_type][i]
                    if len(action['arg']) == 1 and 'minimap' in action['arg']:
                        print(f"\t\t <{action['name']}({action['arg'][0]})>")
                for i in range(len(agent_actions[unit_type])):
                    action = agent_actions[unit_type][i]
                    if len(action['arg']) == 1 and 'screen' in action['arg']:
                        print(f"\t\t <{action['name']}({action['arg'][0]})>")
                for i in range(len(agent_actions[unit_type])):
                    action = agent_actions[unit_type][i]
                    if len(action['arg']) == 1 and 'tag' in action['arg']:
                        print(f"\t\t <{action['name']}({action['arg'][0]})>")
                for i in range(len(agent_actions[unit_type])):
                    action = agent_actions[unit_type][i]
                    if len(action['arg']) == 2:
                        print(f"\t\t <{action['name']}({action['arg'][0]}, {action['arg'][1]})>")

    show(config)

    translator = DefaultTranslatorA('CombatGroup1', 0, config)
    text = """
    Analysis:
        We should do xxx and xxx.
    Actions:
        Team Stalker-1:
            <Move_Screen([2, 9])>
            <Attack_Unit(0x200540001)>
        Team Stalker-2:
            <Ability_Blink_Screen([33, 96])>
            <Move_Minimap([24, 54])>
        Team Stalker-3:
            <Select_Unit_Blink_Screen(0x1007c0001 ,[33, 96])>
            <Hold_Position()>
    """
    # translate() returns (action_lists, action_list_dict, processed_text_a), in that order.
    actions, action_list_dict, processed_text_a = translator.translate(text)
    print(f"\n\ntext to translator:{text}")
    print(f"detected action from translator:\n{actions}\n")
    print(f"detected text_a from translator:\n{processed_text_a}\n")
    print(f"agent {translator.agent_name} action names: {translator.ACTION_SPACE_DICT.keys()}")
    print(f"agent {translator.agent_name} action num: {len(translator.ACTION_SPACE_DICT.keys())}")
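The llm_action.py file below mainly defines the action space of the LLM-PySC2 project, together with a set of helper functions for parsing and executing actions. It links the atomic operations that pysc2 provides (FUNCTIONS.*) to custom action names and action arguments, so that a large language model (LLM) can trigger real game operations through readable text commands.

To make this easier to follow, we first briefly review the pysc2 action mechanism and then walk through the code to see how these actions are defined and organized.

- pysc2 exposes its atomic actions as FUNCTIONS.* (for example FUNCTIONS.Move_screen, FUNCTIONS.Attack_screen).
- Each action has an ID (function_id) and a corresponding action method, and also takes arguments (for example coordinates, and whether the command is issued as queued or now).
- The game is only driven when these actions are invoked as (func_id, func_call, args_tuple) triples.

Example:

    (12, F.Attack_screen, ('queued', (x, y)))

This calls action id=12 (FUNCTIONS.Attack_screen) in queued mode, so that the selected units attack the screen coordinate (x, y).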
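In this file, the commands Protoss may need are sorted into categories such as PROTOSS_BASIC_ACTION_2, PROTOSS_ACTION_BUILD and PROTOSS_ACTION_TRAIN. Each action contains:

- name: the text name under which the action is presented to the LLM (for example 'Attack_Unit' or 'Build_Pylon_Screen'). The LLM only has to generate the matching <ActionName(args...)>.
- arg: the list of arguments the action expects, such as ['tag'], ['screen'] or ['minimap'].
    - 'tag' in arg means a unit or target tag is required (a hexadecimal ID of the form 0x...);
    - 'screen' means a screen coordinate [x, y] is required;
    - 'minimap' means a minimap coordinate [x, y] is required.
- func: a list of one or more (func_id, func_call, args_tuple) triples that define how the action maps to actual pysc2 operations.

For example, the definition

    {
        'name': 'Attack_Unit',
        'arg': ['tag'],
        'func': [
            (12, F.Attack_screen, ('queued', 'screen_tag'))
        ]
    }

means that when the LLM outputs an action such as <Attack_Unit(0xABCD1234)>, the tag 0xABCD1234 is converted to a screen coordinate behind the scenes (the screen_tag placeholder), and FUNCTIONS.Attack_screen is then called with the arguments ('queued', (x, y)).

This mapping, LLM output → custom action parsing → pysc2 atomic operation, closes the loop.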
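
The placeholder substitution described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's code: `fill_placeholders` and the `positions` lookup table are hypothetical names, and the pysc2 function object is replaced by a string so the snippet runs without pysc2 installed.

```python
# Hypothetical sketch: substitute parsed arguments into an action's func triples.
# 'Attack_screen' is a string stand-in for F.Attack_screen, so pysc2 is not needed.
ATTACK_UNIT = {
    'name': 'Attack_Unit',
    'arg': ['tag'],
    'func': [(12, 'Attack_screen', ('queued', 'screen_tag'))],
}


def fill_placeholders(action, tag=None, xy=None, resolve_tag=None):
    """Return func triples with 'screen_tag' / 'screen' / 'minimap' filled in."""
    filled = []
    for func_id, func, args in action['func']:
        new_args = []
        for arg in args:
            if arg == 'screen_tag':
                new_args.append(resolve_tag(tag))  # tag -> (x, y) on screen
            elif arg in ('screen', 'minimap'):
                new_args.append(tuple(xy))
            else:
                new_args.append(arg)  # 'queued', 'now', 'select', ...
        filled.append((func_id, func, tuple(new_args)))
    return filled


# Assumed lookup table standing in for the observation-based tag resolution.
positions = {0xABCD1234: (40, 52)}
triples = fill_placeholders(ATTACK_UNIT, tag=0xABCD1234, resolve_tag=positions.get)
print(triples)
```

In the real file the tag is resolved against the current observation by the get_arg_* helpers rather than a dictionary.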
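Constants and predefined action lists:

- PROTOSS_BASIC_ACTION_SMAC / PROTOSS_BASIC_ACTION_SMAC2: basic commands (Attack_Unit, Move_Screen, etc.) for small-scale battles of the SMAC (StarCraft Multi-Agent Challenge) kind.
- PROTOSS_BASIC_ACTION_1, PROTOSS_BASIC_ACTION_2, PROTOSS_BASIC_ACTION_3: basic action sets grouped by criteria such as whether a unit can attack or whether it is a building.
- PROTOSS_ACTION_BUILD, PROTOSS_ACTION_TRAIN, PROTOSS_ACTION_RESEARCH, PROTOSS_ACTION_WARPTRAIN, etc.: Protoss building, training, research and warp-in operations.
- PROTOSS_ACTION_ABILITY: special abilities (such as Blink, PsiStorm, ForceField, MassRecall).

Argument checks and helper functions:

- get_arg_screen(), get_arg_minimap(), get_arg_screen_tag(), etc.: these validate the coordinates or tags supplied by the LLM, checking whether they are usable, out of bounds, friendly units, buildable, and so on.
- find_unit_type_the_func_belongs_to(): for Train_xxx / Research_xxx actions, finds the building on which the action should be executed (for example, Train_Stalker should be executed on a Gateway / WarpGate).
- find_idle_unit_tag(): searches the current observation for an idle building or unit of the required type and returns its tag for the following actions.

Two "completion" functions:

- add_func_for_select_workers(): to send workers to do something, a select_point action must first select a worker before the remaining actions run; this function inserts that "select worker" step into the action list automatically.
- add_func_for_train_and_research(): to train a unit or research a technology, the camera must first move to the relevant building, the building must be selected, and only then can F.Train_xxx_quick / F.Research_xxx_quick be issued. This function splices those steps together in the same way.

BaseTranslatorA / DefaultTranslatorA:

- DefaultTranslatorA.translate() extracts the passages of the form "<ActionName(args...)>" from the text, matches them against the actions defined in the action space (ACTION_SPACE_DICT), substitutes the parsed arguments for placeholders such as ('queued', 'screen_tag') or 'minimap', and finally produces a series of (func_id, func_call, args_tuple) triples.

Take PROTOSS_BASIC_ACTION_2 as an example:

    PROTOSS_BASIC_ACTION_2 = [
        {
            'name': 'Stop', 'arg': [],
            'func': [(453, F.Stop_quick, ('now'))]
        },
        {
            'name': 'No_Operation', 'arg': [],
            'func': [(0, F.no_op, ())]
        },
        {
            'name': 'Hold_Position', 'arg': [],
            'func': [(274, F.HoldPosition_quick, ('queued'))]
        },
        {
            'name': 'Move_Minimap', 'arg': ['minimap'],
            'func': [(332, F.Move_minimap, ('queued', 'minimap'))]
        },
        {
            'name': 'Move_Screen', 'arg': ['screen'],
            'func': [(331, F.Move_screen, ('queued', 'screen'))]
        },
        {
            'name': 'Attack_Unit', 'arg': ['tag'],
            'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]
        },
    ]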
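
- This list is intended for combat units such as Stalker and Zealot.
- Move_Screen: if the LLM produces <Move_Screen([10,20])>, the 'screen' placeholder is replaced with the actual coordinate (10, 20) during translation, and the result is (331, F.Move_screen, ('queued', (10, 20))).
- Attack_Unit: if the LLM produces <Attack_Unit(0x1BEEF)>, the 'tag' is parsed as the hexadecimal value 0x1BEEF, resolved to a screen coordinate (x, y), and executed as (12, F.Attack_screen, ('queued', (x, y))).

Protoss has many special operations. Warp-in, for instance, needs select_warp_gates first and then a screen point at which to warp the unit in. Building an Assimilator (Build_Assimilator_Near) needs the tag of a vespene geyser converted to a screen coordinate, plus a check that one of our bases is nearby.

The code defines these compound actions:

PROTOSS_ACTION_WARPTRAIN:

    {
        'name': 'Warp_Stalker_Near', 'arg': ['tag'],
        'func': [
            (8, F.select_warp_gates, ('select')),
            (573, F.llm_pysc2_move_camera, ('world_tag')),
            (509, F.TrainWarp_Stalker_screen, ('queued', 'screen_tag'))
        ]
    }

This first selects the warp gates (func_id=8), then moves the camera to the tag (func_id=573 is the custom llm_pysc2_move_camera), and finally warps in a Stalker at a screen coordinate.

PROTOSS_ACTION_BUILD: operations that either build a structure at a given screen coordinate or build near a unit tag (for example an Assimilator on a geyser). Entries such as (40, F.Build_Assimilator_screen, ('queued', 'screen')) are typical.

PROTOSS_ACTION_TRAIN / PROTOSS_ACTION_RESEARCH: mostly (func_id, F.Train_xxx_quick, ('queued')) or (func_id, F.Research_xxx_quick, ('queued')); add_func_for_train_and_research() then injects the steps that move the camera to the building and select it first.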
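
The injection step can be pictured with a short sketch. It mirrors the shape of add_func_for_train_and_research() in the pasted source, but `prepend_select_steps` is a hypothetical name, the pysc2 functions are replaced by strings, and the tag value is made up for illustration.

```python
# Hypothetical sketch of prepending "move camera" and "select building" steps
# to a Train_/Research_ action. Function objects are replaced by strings.
NO_OPERATION = {'name': 'No_Operation', 'arg': [], 'func': [(0, 'no_op', ())]}


def prepend_select_steps(action, source_unit_tag):
    """Return a full action, or No_Operation when no source building was found."""
    if source_unit_tag is None:
        return NO_OPERATION
    prefix = [
        (573, 'llm_pysc2_move_camera', [int(source_unit_tag)]),
        (2, 'select_point', ['select', int(source_unit_tag)]),
    ]
    return {'name': action['name'], 'arg': [], 'func': prefix + action['func']}


train_stalker = {'name': 'Train_Stalker', 'arg': [],
                 'func': [(493, 'Train_Stalker_quick', ('queued',))]}

full_action = prepend_select_steps(train_stalker, 0x100040001)  # illustrative tag
fallback = prepend_select_steps(train_stalker, None)
print([step[1] for step in full_action['func']])
```

Falling back to No_Operation keeps the action list well-formed even when no idle building exists, which is the behavior visible in the pasted function.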
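
get_arg_screen(obs, screen: list, size_screen, action_name)
Checks whether the supplied screen coordinate [x, y] lies within the screen. For an "Attack" action it also checks that the position [x, y] holds an enemy or neutral unit, since our own units must not be attacked.

get_arg_screen_build(obs, screen: list, size_screen, action_name)
Adds extra checks for placing a building. For structures that must be placed inside a Pylon power field, it also verifies that obs.observation.feature_screen.power[x][y] is 1.

get_arg_screen_tag(obs, tag: int, size_screen, action_name)
Converts a unit tag (0xABCD1234) into a screen coordinate (x, y). If the unit cannot be found or is not on screen, it returns an error message and an invalid flag.

get_arg_screen_tag_warp(...)
Used specifically for warp-in. It looks for a warpable area around a Pylon or WarpPrismPhasing, tries random (x, y) candidates, and checks whether a unit can be placed there.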
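
- find_unit_type_the_func_belongs_to(func_id, race): given an action ID and a race, returns the type of unit or building on which the action should be executed, for example (func_id=493) -> Gateway (or WarpGate) -> the Stalker training action.
- find_idle_unit_tag(obs, unit_type, queued_source_unit_tag_list): searches the scene for a unit or building of type unit_type that has active == 0, is fully built (build_progress == 100), and is not in the "already queued" list. If one is found, its tag is returned for the train or research operation.

These helpers take care of the tedious logic: the model only has to say <Train_Stalker()>, and internally an idle gate (Gateway or WarpGate) is selected automatically to execute the actual Train_Stalker_quick.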
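
The idle-building filter described above can be written down as a short sketch. `find_idle_tag` is a hypothetical simplification, the units are plain `SimpleNamespace` stand-ins rather than pysc2 observation entries, and the type ID 62 is used only as an illustrative Gateway value.

```python
from types import SimpleNamespace


def find_idle_tag(units_seen, unit_type, queued_tags):
    """Return the tag of the first finished, idle unit of unit_type not already queued."""
    for u in units_seen:
        if (u.unit_type == unit_type and u.active == 0
                and u.build_progress == 100 and u.tag not in queued_tags):
            return u.tag
    return None


GATEWAY = 62  # illustrative unit type ID
units_seen = [
    SimpleNamespace(tag=1, unit_type=GATEWAY, active=1, build_progress=100),  # busy
    SimpleNamespace(tag=2, unit_type=GATEWAY, active=0, build_progress=60),   # unfinished
    SimpleNamespace(tag=3, unit_type=GATEWAY, active=0, build_progress=100),  # idle
]

first = find_idle_tag(units_seen, GATEWAY, [])
second = find_idle_tag(units_seen, GATEWAY, [first])  # the idle gate is now queued
print(first, second)
```

Passing the list of already-queued tags is what prevents two train commands in the same step from landing on the same building.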

A brief look at DefaultTranslatorA

Near the end of the file there is a DefaultTranslatorA class. It inherits from BaseTranslatorA and is the default action translator. Its core method is translate():

    def translate(self, raw_text_a: str):
        action_lists, action_list_dict, processed_text_a = ...
        # 1. Split the raw text into lines.
        # 2. Look for the line containing "Actions:", which marks the action section.
        # 3. Each "Team XXX:" line means the following actions belong to that team.
        # 4. Each "<ActionName(args)>" entry is parsed:
        #    - ActionName = "Attack_Unit", args = "0x1234..." or "[10, 20]"
        #    - the matching action is looked up in self.ACTION_SPACE_DICT
        #    - args are substituted into func
        #    - the result is a list that pysc2 can execute
        # 5. Anything that cannot be parsed becomes No_Operation.
        return action_lists, action_list_dict, processed_text_a

With this mechanism, the model can output something like:

    Actions:
        Team Stalker-1:
            <Move_Screen([10,20])>
            <Attack_Unit(0x200540001)>

and the translator turns it into an executable form such as:

    [
        {'name': 'Move_Screen', 'arg': ['screen'],
         'func': [(331, F.Move_screen, ('queued', (10, 20)))]},
        {'name': 'Attack_Unit', 'arg': ['tag'],
         'func': [(12, F.Attack_screen, ('queued', (x, y)))]}  # the tag must first be resolved to a screen coordinate
    ]

which is then handed to the Agent to execute frame by frame.
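
The parsing steps listed above can be sketched with a small regex-based parser. This is an illustration of the text format only, not the project's translate() implementation: `parse_actions` and `ACTION_RE` are hypothetical names, and the sketch stops at extracting names, tags and coordinates without looking anything up in an action space.

```python
import re

# <ActionName(args)> with optional surrounding whitespace.
ACTION_RE = re.compile(r'<\s*(\w+)\s*\((.*?)\)\s*>')


def parse_actions(text):
    """Return {team_name: [(action_name, [tags], (x, y) or None), ...]}."""
    teams, current, in_actions = {}, None, False
    for line in text.splitlines():
        stripped = line.strip()
        lowered = stripped.lower()
        if lowered.startswith(('actions:', 'action:')):
            in_actions = True
            continue
        if lowered.startswith(('communications:', 'communication:')):
            in_actions = False
            continue
        if not in_actions:
            continue
        if lowered.startswith('team ') and stripped.endswith(':'):
            current = stripped[5:-1].strip()
            teams[current] = []
            continue
        match = ACTION_RE.search(stripped)
        if match and current is not None:
            name, raw_args = match.groups()
            tags = [int(t, 16) for t in re.findall(r'0x[0-9a-fA-F]+', raw_args)]
            xy = re.search(r'\[\s*(\d+)\s*,\s*(\d+)\s*\]', raw_args)
            point = (int(xy.group(1)), int(xy.group(2))) if xy else None
            teams[current].append((name, tags, point))
    return teams


sample = """Analysis:
We should attack.
Actions:
    Team Stalker-1:
        <Move_Screen([10,20])>
        <Attack_Unit(0x200540001)>
Communications:
    <Ignored_Action()>
"""
parsed = parse_actions(sample)
print(parsed)
```

Anything after the "Communications:" line is ignored, matching the start/stop recognition flag used in the pasted translate() method.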
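
Core goals of this file:

- When the LLM outputs <Attack_Unit(0x1234)> or <Build_Nexus_Near(0x5678)>, the right coordinates can be found in the observation, and the code can check whether building or attacking is possible.
- Integration with the sub-agent configuration in config.py: in config.py you will find sub-agents such as "Defender", "Builder" and "CombatGroupX". Each agent references only the action sets it can execute; a Stalker may reference PROTOSS_BASIC_ACTION_2, while a Probe references PROTOSS_BASIC_ACTION_2 + PROTOSS_ACTION_BUILD, and so on.
- Translator: DefaultTranslatorA parses the text commands into action triples, which the Agent then executes one by one, adding steps such as "select unit" or "move camera" where needed.

For example, when the LLM outputs:

    Actions:
        Team Defender:
            <Ability_Blink_Screen([50, 100])>

it is eventually turned into:

- (180, F.Effect_Blink_screen, ('queued', (50, 100))), where 180 is the pysc2 ID of the Blink action;
- together with commands such as (3, F.select_rect, ...) that make sure the correct units are selected.

Broadly speaking, llm_action.py wraps the low-level PySC2 API (a collection of function IDs) into readable commands, backed by argument validation logic and a translator, so that the model can issue high-level orders in a near-natural-language form such as <Move_Screen([10,20])>.

To extend or modify the action space (for example to define new build or ability actions for Terran or Zerg), follow the same pattern: add a { 'name': 'xxx', 'arg': [...], 'func': [...] } entry to one of the lists, and have the relevant sub-agent use it in config.py.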
PROTOSS_BASIC_ACTION_SMAC / PROTOSS_BASIC_ACTION_SMAC2: basic commands (Attack_Unit, Move_Screen, etc.) for small-scale battles of the SMAC (StarCraft Multi-Agent Challenge) kind.
PROTOSS_BASIC_ACTION_1, PROTOSS_BASIC_ACTION_2, PROTOSS_BASIC_ACTION_3: basic action sets grouped by criteria such as whether a unit can attack or whether it is a building.
PROTOSS_ACTION_BUILD, PROTOSS_ACTION_TRAIN, PROTOSS_ACTION_RESEARCH, PROTOSS_ACTION_WARPTRAIN, etc.: Protoss building, training, research and warp-in operations.
PROTOSS_ACTION_ABILITY: special abilities (such as Blink, PsiStorm, ForceField, MassRecall).
Please explain these in more detail.
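In llm_action.py, the Protoss actions are split into several action lists, each organized for a different scenario or unit type. The following explains what each list contains and why they are separated this way.

PROTOSS_BASIC_ACTION_SMAC / PROTOSS_BASIC_ACTION_SMAC2

SMAC (StarCraft Multi-Agent Challenge) is an environment focused on micro-management of small squads. It usually needs only the most basic combat operations, attack and move, and does not involve macro elements such as base building, mining or economy. Its action lists are therefore simple.

PROTOSS_BASIC_ACTION_SMAC

    PROTOSS_BASIC_ACTION_SMAC = [
        {
            'name': 'Attack_Unit', 'arg': ['tag'],
            'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]
        },
        # a Select_Unit_Attack_Unit example is commented out in the original
    ]

- Attack_Unit: takes the tag of a unit, converts it to a screen coordinate, and attacks with FUNCTIONS.Attack_screen.

PROTOSS_BASIC_ACTION_SMAC2

    PROTOSS_BASIC_ACTION_SMAC2 = [
        {
            'name': 'Attack_Unit', 'arg': ['tag'],
            'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]
        },
        {
            'name': 'Move_Screen', 'arg': ['screen'],
            'func': [(331, F.Move_screen, ('queued', 'screen'))]
        },
        # the variant that selects a single unit before moving is likewise commented out
    ]

In SMAC2, movement (Move_Screen) is added on top of attacking, so that an Agent can control the position of its troops for kiting or retreating.

The SMAC action lists are therefore deliberately minimal and serve micro-management only: attack the enemies you can see and move to the open ground you can see, with little else.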

PROTOSS_BASIC_ACTION_1 / PROTOSS_BASIC_ACTION_2 / PROTOSS_BASIC_ACTION_3

In the full StarCraft II game on large maps, Protoss has many different types of units and buildings. For code organization and reuse, the most common basic commands are split into three groups:

- PROTOSS_BASIC_ACTION_1: mainly for buildings or special units. It usually contains operations such as Stop and No_Operation, plus the building-specific Stop_Building.
- PROTOSS_BASIC_ACTION_2: for combat units that can attack, such as Stalker, Zealot and Immortal. It contains attack, move, Hold Position and other common operations.
- PROTOSS_BASIC_ACTION_3: for units that cannot attack, such as transports and Observers. It has move and Hold Position but no Attack command.

PROTOSS_BASIC_ACTION_1

    PROTOSS_BASIC_ACTION_1 = [
        {'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
        {'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
        {'name': 'Stop_Building', 'arg': [], 'func': [(454, F.Stop_Building_quick, ('queued'))]},
    ]

- Stop: issues a stop command to the selected building or unit.
- No_Operation: does nothing; used as a placeholder or to skip a step.
- Stop_Building: cancels the building's current production order or stops the queue running on it, for example stopping a shield upgrade.

PROTOSS_BASIC_ACTION_2

    PROTOSS_BASIC_ACTION_2 = [
        {'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
        {'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
        {'name': 'Hold_Position', 'arg': [], 'func': [(274, F.HoldPosition_quick, ('queued'))]},
        {'name': 'Move_Minimap', 'arg': ['minimap'], 'func': [(332, F.Move_minimap, ('queued', 'minimap'))]},
        {'name': 'Move_Screen', 'arg': ['screen'], 'func': [(331, F.Move_screen, ('queued', 'screen'))]},
        {'name': 'Attack_Unit', 'arg': ['tag'], 'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
        # extensions such as Board_WarpPrism / Attack_Screen could be added here
    ]

- Hold_Position: the unit stays where it is and does not move or chase targets.
- Move_Minimap / Move_Screen: move units using a minimap coordinate or a screen coordinate.
- Attack_Unit: attacks the unit identified by a tag (converted to a screen coordinate internally).

PROTOSS_BASIC_ACTION_3

    PROTOSS_BASIC_ACTION_3 = [
        {'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
        {'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
        {'name': 'Hold_Position', 'arg': [], 'func': [(274, F.HoldPosition_quick, ('queued'))]},
        {'name': 'Move_Minimap', 'arg': ['minimap'], 'func': [(332, F.Move_minimap, ('queued', 'minimap'))]},
        {'name': 'Move_Screen', 'arg': ['screen'], 'func': [(331, F.Move_screen, ('queued', 'screen'))]},
    ]

There is no Attack_Unit here because these units have no attack of their own, for example the Observer, the Warp Prism and other pure support units.

Beyond basic movement and attack, Protoss also has building, production, research and warp-in operations. These live in separate action lists, and each action corresponds to specific pysc2 FUNCTIONS.

PROTOSS_ACTION_BUILD: building

    PROTOSS_ACTION_BUILD = [
        # build a base
        {'name': 'Build_Nexus_Near', 'arg': ['tag'],
         'func': [(573, F.llm_pysc2_move_camera, ('world_tag')),
                  (65, F.Build_Nexus_screen, ('queued', 'screen_tag'))]},
        # vespene geyser
        {'name': 'Build_Assimilator_Near', 'arg': ['tag'],
         'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen_tag'))]},
        # build directly at a screen coordinate
        {'name': 'Build_Pylon_Screen', 'arg': ['screen'],
         'func': [(70, F.Build_Pylon_screen, ('queued', 'screen'))]},
        # similarly Build_Gateway_Screen / Build_CyberneticsCore_Screen and so on
    ]

- Build_..._Near(tag): first locates a resource or power point through the tag, then issues the build command near it. For example, Build_Assimilator_Near(0xABC) builds an Assimilator on the corresponding vespene geyser.
- Build_..._Screen([x, y]): issues the build command at a coordinate on the screen.
- These build commands are paired with a select_point command that selects a worker.

PROTOSS_ACTION_TRAIN: training (unit production)

    PROTOSS_ACTION_TRAIN = [
        {'name': 'Train_Zealot', 'arg': [], 'func': [(503, F.Train_Zealot_quick, ('queued'))]},
        {'name': 'Train_Stalker', 'arg': [], 'func': [(493, F.Train_Stalker_quick, ('queued'))]},
        {'name': 'Train_Immortal', 'arg': [], 'func': [(473, F.Train_Immortal_quick, ('queued'))]},
        # plus Oracle, Phoenix, VoidRay, etc.
    ]

- Each entry carries the ID of its pysc2 Train_xxx_quick function; Train_Zealot_quick, for example, has ID 503.
- The producing building has to be selected before Train_xxx_quick can be issued.
- That is what add_func_for_train_and_research() does: it automatically inserts the move-camera and select-building operations.

PROTOSS_ACTION_RESEARCH: research

    PROTOSS_ACTION_RESEARCH = [
        {'name': 'Research_WarpGate', 'arg': [], 'func': [(428, F.Research_WarpGate_quick, ('queued'))]},
        {'name': 'Research_Blink', 'arg': [], 'func': [(356, F.Research_Blink_quick, ('queued'))]},
        {'name': 'Research_Charge', 'arg': [], 'func': [(359, F.Research_Charge_quick, ('queued'))]},
        {'name': 'Research_PsiStorm', 'arg': [], 'func': [(401, F.Research_PsiStorm_quick, ('queued'))]},
        # plus air weapons, ground armor, shields upgrades, etc.
    ]

- Research is carried out in specific buildings (CyberneticsCore, Forge, TemplarArchive, etc.), so these buildings also have to be selected first.
- add_func_for_train_and_research() again splices in the move-camera and select-building steps automatically.

PROTOSS_ACTION_WARPTRAIN: warp-in

    PROTOSS_ACTION_WARPTRAIN = [
        {
            'name': 'Warp_Stalker_Near', 'arg': ['tag'],
            'func': [
                (8, F.select_warp_gates, ('select')),
                (573, F.llm_pysc2_move_camera, ('world_tag')),
                (509, F.TrainWarp_Stalker_screen, ('queued', 'screen_tag'))
            ]
        },
        # plus Warp_Zealot_Near, Warp_Sentry_Near, etc.
    ]

- The warp gates are selected first (select_warp_gates, action ID=8), then the camera moves to the position of a Pylon or WarpPrismPhasing, and the requested unit is warped in.
- The tag is usually the ID of the unit providing the warp-in point, such as a Pylon with a power field or a Warp Prism in phasing mode (WarpPrismPhasing).
- get_arg_screen_tag_warp() is used to find a usable coordinate (x, y) at which to place the unit.

PROTOSS_ACTION_ABILITY

Protoss units have many abilities, such as Blink, PsiStorm, ForceField, GuardianShield and the Warp Prism mode switch. These are collected in PROTOSS_ACTION_ABILITY.

    PROTOSS_ACTION_ABILITY = [
        {'name': 'Ability_Blink_Screen', 'arg': ['screen'],
         'func': [(180, F.Effect_Blink_screen, ('queued', 'screen'))]},
        {'name': 'Ability_PsiStorm_Screen', 'arg': ['screen'],
         'func': [(218, F.Effect_PsiStorm_screen, ('queued', 'screen'))]},
        {'name': 'Ability_ForceField_Screen', 'arg': ['screen'],
         'func': [(193, F.Effect_ForceField_screen, ('queued', 'screen'))]},
        {'name': 'Ability_NexusMassRecall_Near', 'arg': ['tag'],
         'func': [(573, F.llm_pysc2_move_camera, ('world_tag')),
                  (529, F.Effect_MassRecall_Nexus_screen, ('queued', 'screen_tag'))]},
        # and other abilities
    ]

- Ability_Blink_Screen([x, y]): makes the selected units that have Blink (such as Stalkers) blink to the target.
- Ability_PsiStorm_Screen([x, y]): with a High Templar selected, casts Psionic Storm on the given area.
- Ability_NexusMassRecall_Near(0xABC): moves the camera to the position of a unit and then casts Mass Recall at that screen position.

The func of most of these abilities has the form (func_id, F.Effect_XXX_screen, ('queued', target coordinate)); some are morph actions such as Morph_WarpPrismPhasingMode.
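
Summary of the lists:

- SMAC series (PROTOSS_BASIC_ACTION_SMAC, PROTOSS_BASIC_ACTION_SMAC2)
- Basic actions (PROTOSS_BASIC_ACTION_1/2/3)
- Build, train, research, warp-in (PROTOSS_ACTION_BUILD / PROTOSS_ACTION_TRAIN / PROTOSS_ACTION_RESEARCH / PROTOSS_ACTION_WARPTRAIN), covering Build_Pylon, Train_Stalker, Research_Blink, Warp_Zealot and so on
- Abilities (PROTOSS_ACTION_ABILITY)

With this division, each unit (or sub-agent) picks the action lists that match its capabilities, which avoids illegal actions such as a worker trying to Blink.

It also makes the commands written for the model easier to read and extend:

- a Stalker group can use PROTOSS_BASIC_ACTION_2 plus Blink abilities;
- a Probe can use PROTOSS_BASIC_ACTION_2 + PROTOSS_ACTION_BUILD;
- warp-in actions come from PROTOSS_ACTION_WARPTRAIN, and so on.

In short, this keeps the code structure clear and easy to search, and it reduces the chance of the LLM issuing nonsensical commands, because each sub-agent exposes only a small subset of actions.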
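
How list composition limits what each sub-agent may do can be sketched as follows. This is a simplified illustration: the real configuration in config.py is keyed by agent and unit type, whereas `AGENT_ACTIONS`, `index_by_name` and `is_allowed` are hypothetical names, and the pysc2 functions are replaced by strings.

```python
# Hypothetical sketch: compose per-agent action spaces from shared lists and
# reject action names that an agent does not expose.
BASIC_ACTION_2 = [
    {'name': 'Move_Screen', 'arg': ['screen'],
     'func': [(331, 'Move_screen', ('queued', 'screen'))]},
    {'name': 'Attack_Unit', 'arg': ['tag'],
     'func': [(12, 'Attack_screen', ('queued', 'screen_tag'))]},
]
ACTION_BUILD = [
    {'name': 'Build_Pylon_Screen', 'arg': ['screen'],
     'func': [(70, 'Build_Pylon_screen', ('queued', 'screen'))]},
]
ACTION_BLINK = [
    {'name': 'Ability_Blink_Screen', 'arg': ['screen'],
     'func': [(180, 'Effect_Blink_screen', ('queued', 'screen'))]},
]

AGENT_ACTIONS = {
    'Builder': BASIC_ACTION_2 + ACTION_BUILD,
    'CombatGroup1': BASIC_ACTION_2 + ACTION_BLINK,
}


def index_by_name(action_list):
    """Build the name -> action lookup that a translator would search."""
    return {action['name']: action for action in action_list}


def is_allowed(agent, action_name):
    return action_name in index_by_name(AGENT_ACTIONS[agent])


print(is_allowed('Builder', 'Build_Pylon_Screen'))    # exposed to the Builder
print(is_allowed('Builder', 'Ability_Blink_Screen'))  # not exposed to the Builder
```

An action name outside an agent's lookup simply fails to resolve, which is how the pasted translator ends up substituting No_Operation for unknown actions.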
AN_ACTION = {'name': '', 'arg': [], 'func': []}
PROTOSS_BASIC_ACTION_SMAC = [
{'name': 'Attack_Unit', 'arg': ['tag'],
'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
]
PROTOSS_BASIC_ACTION_SMAC2 = [
{'name': 'Attack_Unit', 'arg': ['tag'],
'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
{'name': 'Move_Screen', 'arg': ['screen'],
'func': [(331, F.Move_screen, ('queued', 'screen'))]},
]
PROTOSS_BASIC_ACTION_1 = [
{'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
{'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
{'name': 'Stop_Building', 'arg': [], 'func': [(454, F.Stop_Building_quick, ('queued'))]},
]
PROTOSS_BASIC_ACTION_2 = [
{'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
{'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
{'name': 'Hold_Position', 'arg': [], 'func': [(274, F.HoldPosition_quick, ('queued'))]},
{'name': 'Move_Minimap', 'arg': ['minimap'], 'func': [(332, F.Move_minimap, ('queued', 'minimap'))]},
{'name': 'Move_Screen', 'arg': ['screen'], 'func': [(331, F.Move_screen, ('queued', 'screen'))]},
{'name': 'Attack_Unit', 'arg': ['tag'], 'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
]
PROTOSS_BASIC_ACTION_3 = [
{'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
{'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
{'name': 'Hold_Position', 'arg': [], 'func': [(274, F.HoldPosition_quick, ('queued'))]},
{'name': 'Move_Minimap', 'arg': ['minimap'], 'func': [(332, F.Move_minimap, ('queued', 'minimap'))]},
{'name': 'Move_Screen', 'arg': ['screen'], 'func': [(331, F.Move_screen, ('queued', 'screen'))]},
]
PROTOSS_ACTION_WARPTRAIN = [
{'name': 'Warp_Adept_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(505, F.TrainWarp_Adept_screen, ('queued', 'screen_tag'))]}, # tag for WarpprismPhasing/Pylon
{'name': 'Warp_DarkTemplar_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(506, F.TrainWarp_DarkTemplar_screen, ('queued', 'screen_tag'))]}, # tag for WarpprismPhasing/Pylon
{'name': 'Warp_HighTemplar_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(507, F.TrainWarp_HighTemplar_screen, ('queued', 'screen_tag'))]}, # tag for WarpprismPhasing/Pylon
{'name': 'Warp_Sentry_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(508, F.TrainWarp_Sentry_screen, ('queued', 'screen_tag'))]}, # tag for WarpprismPhasing/Pylon
{'name': 'Warp_Stalker_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(509, F.TrainWarp_Stalker_screen, ('queued', 'screen_tag'))]}, # tag for WarpprismPhasing/Pylon
{'name': 'Warp_Zealot_Near', 'arg': ['tag'],
'func': [(8, F.select_warp_gates, ('select')), (573, F.llm_pysc2_move_camera, ('world_tag')),
(510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),
(510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),
(510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),
(510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),]}, # tag for WarpprismPhasing/Pylon
]
PROTOSS_ACTION_TRAIN = [
{'name': 'Train_Mothership', 'arg': [], 'func': [(541, F.Train_Mothership_quick, ('queued'))]},
{'name': 'Train_Adept', 'arg': [], 'func': [(457, F.Train_Adept_quick, ('queued'))]},
{'name': 'Train_DarkTemplar', 'arg': [], 'func': [(465, F.Train_DarkTemplar_quick, ('queued'))]},
{'name': 'Train_HighTemplar', 'arg': [], 'func': [(471, F.Train_HighTemplar_quick, ('queued'))]},
{'name': 'Train_Sentry', 'arg': [], 'func': [(491, F.Train_Sentry_quick, ('queued'))]},
{'name': 'Train_Stalker', 'arg': [], 'func': [(493, F.Train_Stalker_quick, ('queued'))]},
{'name': 'Train_Zealot', 'arg': [], 'func': [(503, F.Train_Zealot_quick, ('queued'))]},
{'name': 'Train_Oracle', 'arg': [], 'func': [(482, F.Train_Oracle_quick, ('queued'))]},
{'name': 'Train_Phoenix', 'arg': [], 'func': [(484, F.Train_Phoenix_quick, ('queued'))]},
{'name': 'Train_VoidRay', 'arg': [], 'func': [(500, F.Train_VoidRay_quick, ('queued'))]},
{'name': 'Train_Tempest', 'arg': [], 'func': [(495, F.Train_Tempest_quick, ('queued'))]},
{'name': 'Train_Carrier', 'arg': [], 'func': [(461, F.Train_Carrier_quick, ('queued'))]},
{'name': 'Train_Observer', 'arg': [], 'func': [(481, F.Train_Observer_quick, ('queued'))]},
{'name': 'Train_WarpPrism', 'arg': [], 'func': [(501, F.Train_WarpPrism_quick, ('queued'))]},
{'name': 'Train_Immortal', 'arg': [], 'func': [(473, F.Train_Immortal_quick, ('queued'))]},
{'name': 'Train_Colossus', 'arg': [], 'func': [(462, F.Train_Colossus_quick, ('queued'))]},
{'name': 'Train_Disruptor', 'arg': [], 'func': [(466, F.Train_Disruptor_quick, ('queued'))]},
]
PROTOSS_ACTION_RESEARCH = [
{'name': 'Research_ProtossAirArmor', 'arg': [],
'func': [(381, F.Research_ProtossAirArmor_quick, ('queued'))]},
{'name': 'Research_ProtossAirWeapons', 'arg': [],
'func': [(385, F.Research_ProtossAirWeapons_quick, ('queued'))]},
{'name': 'Research_WarpGate', 'arg': [],
'func': [(428, F.Research_WarpGate_quick, ('queued'))]},
{'name': 'Research_ProtossGroundArmor', 'arg': [],
'func': [(389, F.Research_ProtossGroundArmor_quick, ('queued'))]},
{'name': 'Research_ProtossGroundWeapons', 'arg': [],
'func': [(393, F.Research_ProtossGroundWeapons_quick, ('queued'))]},
{'name': 'Research_ProtossShields', 'arg': [],
'func': [(397, F.Research_ProtossShields_quick, ('queued'))]},
{'name': 'Research_Charge', 'arg': [],
'func': [(359, F.Research_Charge_quick, ('queued'))]},
{'name': 'Research_Blink', 'arg': [],
'func': [(356, F.Research_Blink_quick, ('queued'))]},
{'name': 'Research_AdeptResonatingGlaives', 'arg': [],
'func': [(351, F.Research_AdeptResonatingGlaives_quick, ('queued'))]},
{'name': 'Research_PhoenixAnionPulseCrystals', 'arg': [],
'func': [(379, F.Research_PhoenixAnionPulseCrystals_quick, ('queued'))]},
{'name': 'Research_ExtendedThermalLance', 'arg': [],
'func': [(364, F.Research_ExtendedThermalLance_quick, ('queued'))]},
{'name': 'Research_GraviticBooster', 'arg': [],
'func': [(366, F.Research_GraviticBooster_quick, ('queued'))]},
{'name': 'Research_GraviticDrive', 'arg': [],
'func': [(367, F.Research_GraviticDrive_quick, ('queued'))]},
{'name': 'Research_PsiStorm', 'arg': [],
'func': [(401, F.Research_PsiStorm_quick, ('queued'))]},
{'name': 'Research_ShadowStrike', 'arg': [],
'func': [(404, F.Research_ShadowStrike_quick, ('queued'))]},
]
PROTOSS_ACTION_BUILD = [
{'name': 'Build_Nexus_Near', 'arg': ['tag'],
'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (65, F.Build_Nexus_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Assimilator_Near', 'arg': ['tag'],
'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Nexus_Screen', 'arg': ['screen'],
'func': [(65, F.Build_Nexus_screen, ('queued', 'screen'))]},
{'name': 'Build_Assimilator_Screen', 'arg': ['screen'],
'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen'))]},
{'name': 'Build_Pylon_Screen', 'arg': ['screen'],
'func': [(70, F.Build_Pylon_screen, ('queued', 'screen'))]},
{'name': 'Build_Gateway_Screen', 'arg': ['screen'],
'func': [(57, F.Build_Gateway_screen, ('queued', 'screen'))]},
{'name': 'Build_CyberneticsCore_Screen', 'arg': ['screen'],
'func': [(48, F.Build_CyberneticsCore_screen, ('queued', 'screen'))]},
{'name': 'Build_Forge_Screen', 'arg': ['screen'],
'func': [(55, F.Build_Forge_screen, ('queued', 'screen'))]},
{'name': 'Build_PhotonCannon_Screen', 'arg': ['screen'],
'func': [(69, F.Build_PhotonCannon_screen, ('queued', 'screen'))]},
{'name': 'Build_ShieldBattery_Screen', 'arg': ['screen'],
'func': [(525, F.Build_ShieldBattery_screen, ('queued', 'screen'))]},
{'name': 'Build_TwilightCouncil_Screen', 'arg': ['screen'],
'func': [(101, F.Build_TwilightCouncil_screen, ('queued', 'screen'))]},
{'name': 'Build_TemplarArchive_Screen', 'arg': ['screen'],
'func': [(100, F.Build_TemplarArchive_screen, ('queued', 'screen'))]},
{'name': 'Build_DarkShrine_Screen', 'arg': ['screen'],
'func': [(49, F.Build_DarkShrine_screen, ('queued', 'screen'))]},
{'name': 'Build_Stargate_Screen', 'arg': ['screen'],
'func': [(88, F.Build_Stargate_screen, ('queued', 'screen'))]},
{'name': 'Build_FleetBeacon_Screen', 'arg': ['screen'],
'func': [(54, F.Build_FleetBeacon_screen, ('queued', 'screen'))]},
{'name': 'Build_RoboticsBay_Screen', 'arg': ['screen'],
'func': [(81, F.Build_RoboticsBay_screen, ('queued', 'screen'))]},
{'name': 'Build_RoboticsFacility_Screen', 'arg': ['screen'],
'func': [(82, F.Build_RoboticsFacility_screen, ('queued', 'screen'))]},
]
PROTOSS_ACTION_EASY_BUILD = [
{'name': 'Build_Nexus_Near', 'arg': ['tag'],
'func': [(65, F.Build_Nexus_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Assimilator_Near', 'arg': ['tag'],
'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Pylon_Near', 'arg': ['tag'],
'func': [(70, F.Build_Pylon_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Gateway_Near', 'arg': ['tag'],
'func': [(57, F.Build_Gateway_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_CyberneticsCore_Near', 'arg': ['tag'],
'func': [(48, F.Build_CyberneticsCore_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Forge_Near', 'arg': ['tag'],
'func': [(55, F.Build_Forge_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_PhotonCannon_Near', 'arg': ['tag'],
'func': [(69, F.Build_PhotonCannon_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_ShieldBattery_Near', 'arg': ['tag'],
'func': [(525, F.Build_ShieldBattery_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_TwilightCouncil_Near', 'arg': ['tag'],
'func': [(101, F.Build_TwilightCouncil_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_TemplarArchive_Near', 'arg': ['tag'],
'func': [(100, F.Build_TemplarArchive_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_DarkShrine_Near', 'arg': ['tag'],
'func': [(49, F.Build_DarkShrine_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_Stargate_Near', 'arg': ['tag'],
'func': [(88, F.Build_Stargate_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_FleetBeacon_Near', 'arg': ['tag'],
'func': [(54, F.Build_FleetBeacon_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_RoboticsBay_Near', 'arg': ['tag'],
'func': [(81, F.Build_RoboticsBay_screen, ('queued', 'screen_tag'))]},
{'name': 'Build_RoboticsFacility_Near', 'arg': ['tag'],
'func': [(82, F.Build_RoboticsFacility_screen, ('queued', 'screen_tag'))]},
]
PROTOSS_ACTION_ABILITY = [
{'name': 'Morph_WarpPrismPhasingMode', 'arg': [],
'func': [(329, F.Morph_WarpPrismPhasingMode_quick, ('queued'))]},
{'name': 'Morph_WarpPrismTransportMode', 'arg': [],
'func': [(330, F.Morph_WarpPrismTransportMode_quick, ('queued'))]},
{'name': 'Morph_SurveillanceMode', 'arg': [],
'func': [(538, F.Morph_SurveillanceMode_quick, ('queued'))]},
{'name': 'Morph_ObserverMode', 'arg': [],
'func': [(535, F.Morph_ObserverMode_quick, ('queued'))]},
{'name': 'Morph_Archon', 'arg': [],
'func': [(296, F.Morph_Archon_quick, ('queued'))]},
{'name': 'Ability_PrismaticAlignment', 'arg': [],
'func': [(244, F.Effect_VoidRayPrismaticAlignment_quick, ('queued'))]},
{'name': 'Ability_CancelPhaseShift', 'arg': [],
'func': [(453, F.Stop_quick, ('queued'))]},
{'name': 'Ability_GuardianShield', 'arg': [],
'func': [(197, F.Effect_GuardianShield_quick, ('queued'))]},
{'name': 'Ability_PulsarBeamOn', 'arg': [],
'func': [(38, F.Behavior_PulsarBeamOn_quick, ('queued'))]},
{'name': 'Ability_ChronoBoost_Unit', 'arg': ['tag'],
'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (527, F.Effect_ChronoBoostEnergyCost_screen, ('queued', 'screen_tag'))]},
{'name': 'Ability_NexusMassRecall_Near', 'arg': ['tag'],
'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (529, F.Effect_MassRecall_Nexus_screen, ('queued', 'screen_tag'))]},
{'name': 'Ability_MothershipMassRecall_Near', 'arg': ['tag'],
'func': [(573, F.llm_pysc2_move_camera, ('world_tag')), (208, F.Effect_MassRecall_screen, ('queued', 'screen_tag'))]},
{'name': 'Ability_ShadowStride_Unit', 'arg': ['tag'],
'func': [(182, F.Effect_ShadowStride_screen, ('queued', 'screen_tag'))]},
{'name': 'Ability_GravitonBeam_Unit', 'arg': ['tag'],
'func': [(196, F.Effect_GravitonBeam_screen, ('queued', 'screen_tag'))]},
{'name': 'Ability_Blink_Screen', 'arg': ['screen'],
'func': [(180, F.Effect_Blink_screen, ('queued', 'screen'))]},
{'name': 'Ability_AdeptPhaseShift_Screen', 'arg': ['screen'],
'func': [(177, F.Effect_AdeptPhaseShift_screen, ('queued', 'screen'))]},
{'name': 'Ability_PsiStorm_Screen', 'arg': ['screen'],
'func': [(218, F.Effect_PsiStorm_screen, ('queued', 'screen'))]},
{'name': 'Ability_PurificationNova_Screen', 'arg': ['screen'],
'func': [(219, F.Effect_PurificationNova_screen, ('queued', 'screen'))]},
{'name': 'Ability_ForceField_Screen', 'arg': ['screen'],
'func': [(193, F.Effect_ForceField_screen, ('queued', 'screen'))]},
{'name': 'Ability_TimeWarp_Screen', 'arg': ['screen'],
'func': [(241, F.Effect_TimeWarp_screen, ('queued', 'screen'))]},
{'name': 'Ability_OracleRevelation_Screen', 'arg': ['screen'],
'func': [(214, F.Effect_OracleRevelation_screen, ('queued', 'screen'))]},
{'name': 'Ability_StasisTrap_Screen', 'arg': ['screen'],
'func': [(90, F.Build_StasisTrap_screen, ('queued', 'screen'))]},
{'name': 'Load_Unit', 'arg': ['tag'],
'func': [(287, F.Load_screen, ('queued', 'screen_tag'))]},
{'name': 'Unload_Screen', 'arg': ['screen'],
'func': [(516, F.UnloadAllAt_screen, ('queued', 'screen'))]},
{'name': 'Lock_Nexus_Near', 'arg': ['tag'],
'func': [(70, F.Build_Pylon_screen, ('queued', 'screen_tag'))]},
{'name': 'Lock_Assimilator_Near', 'arg': ['tag'],
'func': [(40, F.Build_Assimilator_screen, ('queued', 'screen_tag'))]},
{'name': 'Select_Unit_Blink_Screen', 'arg': ['tag', 'screen'],
'func': [(3, F.select_rect, ('select', 'screen1_tag', 'screen2_tag')),
(180, F.Effect_Blink_screen, ('queued', 'screen'))]},
]
Please help me analyze and understand this content in full.
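This llm_action.py file defines a large set of Protoss action collections (the action space). Its purpose is to wrap PySC2's low-level atomic operations (such as FUNCTIONS.Attack_screen and FUNCTIONS.Move_screen) into readable, directly usable action descriptions, so that an LLM or other high-level logic can schedule and call them.
Overall, the actions fall into the following groups:
- PROTOSS_BASIC_ACTION_SMAC and PROTOSS_BASIC_ACTION_SMAC2
- PROTOSS_BASIC_ACTION_1, PROTOSS_BASIC_ACTION_2, PROTOSS_BASIC_ACTION_3
- PROTOSS_ACTION_WARPTRAIN
- PROTOSS_ACTION_TRAIN
- PROTOSS_ACTION_RESEARCH
- PROTOSS_ACTION_BUILD and PROTOSS_ACTION_EASY_BUILD
- PROTOSS_ACTION_ABILITY
At the top of the file there is also a "standard action object", AN_ACTION (a template structure), and the comments describe the fields of a single action:
```python
AN_ACTION = {'name': '', 'arg': [], 'func': []}
```
- name: the action name (for example 'Attack_Unit')
- arg: the parameter types the action requires (such as ['tag'], ['screen'], ['minimap'])
- func: one or more (func_id, func_call, arg_tuple) triples, each corresponding to a concrete PySC2 operation
Next, I will go through the file from top to bottom and, following each action list's definition, explain what the actions do and what each field means.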
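To make the three fields concrete, here is a minimal sketch of how such a list could be inspected. The names `fake_func`, `EXAMPLE_ACTIONS`, `find_action`, `func_ids`, and `is_well_formed` are hypothetical helpers introduced only for illustration; they are not part of llm_action.py, and `fake_func` merely stands in for a real PySC2 function such as F.Move_screen.

```python
from typing import Any, Dict, List, Optional


def fake_func(*args: Any) -> tuple:
    """Hypothetical stand-in for a PySC2 function (e.g. F.Move_screen)."""
    return args


# Two entries written in the same shape as the lists in the file.
EXAMPLE_ACTIONS: List[Dict[str, Any]] = [
    {'name': 'No_Operation', 'arg': [], 'func': [(0, fake_func, ())]},
    {'name': 'Move_Screen', 'arg': ['screen'],
     'func': [(331, fake_func, ('queued', 'screen'))]},
]


def find_action(actions: List[Dict[str, Any]], name: str) -> Optional[Dict[str, Any]]:
    """Return the first action whose 'name' matches, or None."""
    return next((a for a in actions if a['name'] == name), None)


def func_ids(action: Dict[str, Any]) -> List[int]:
    """Collect the PySC2 function ids an action expands to, in order."""
    return [func_id for func_id, _func, _args in action['func']]


def is_well_formed(action: Dict[str, Any]) -> bool:
    """Check that an action has the three expected fields and valid triples.

    The file writes a single placeholder as ('queued'), which Python reads
    as the plain string 'queued', so both str and tuple are accepted here.
    """
    if set(action) != {'name', 'arg', 'func'}:
        return False
    return all(
        len(triple) == 3
        and isinstance(triple[0], int)
        and callable(triple[1])
        and isinstance(triple[2], (str, tuple))
        for triple in action['func']
    )
```

A lookup such as `find_action(EXAMPLE_ACTIONS, 'Move_Screen')` returns the dictionary whose `arg` list tells the caller which parameters the LLM must supply.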
PROTOSS_BASIC_ACTION_SMAC / PROTOSS_BASIC_ACTION_SMAC2
These two lists are intended for SMAC (StarCraft Multi-Agent Challenge), a small-scale combat setting. SMAC usually needs only simple move and attack actions for unit micro-management and involves no macro operations such as building or research.
PROTOSS_BASIC_ACTION_SMAC
```python
PROTOSS_BASIC_ACTION_SMAC = [
    {'name': 'Attack_Unit', 'arg': ['tag'],
     'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
    # a Select_Unit_Attack_Unit example is also present, commented out
]
```
- Attack_Unit: takes a unit tag, converts that tag into a screen coordinate (screen_tag), and executes Attack_screen (func_id=12), so the currently selected units attack the unit at that coordinate.
PROTOSS_BASIC_ACTION_SMAC2
```python
PROTOSS_BASIC_ACTION_SMAC2 = [
    {'name': 'Attack_Unit', 'arg': ['tag'],
     'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
    {'name': 'Move_Screen', 'arg': ['screen'],
     'func': [(331, F.Move_screen, ('queued', 'screen'))]}
    # the other, commented-out entries are finer-grained select-then-attack/move actions
]
```
- In addition to Attack_Unit, this list adds Move_Screen: it passes a screen coordinate [x, y] to FUNCTIONS.Move_screen (func_id=331), which moves the unit to the target point.
PROTOSS_BASIC_ACTION_1, PROTOSS_BASIC_ACTION_2, PROTOSS_BASIC_ACTION_3
For full StarCraft II games, the Protoss units and buildings are split into three classes (whether they can attack, whether they are buildings, whether they can only move, and so on), which gives reusable "basic action sets".
PROTOSS_BASIC_ACTION_1 (typically for buildings and units that do not move)
```python
PROTOSS_BASIC_ACTION_1 = [
    {'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
    {'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
    {'name': 'Stop_Building', 'arg': [], 'func': [(454, F.Stop_Building_quick, ('queued'))]},
]
```
- Stop / Stop_Building: stop the current action or cancel a building's queued actions. func_id 453 and 454 correspond to Stop_quick and Stop_Building_quick.
- No_Operation: do nothing; used as a placeholder or to skip a step.
This set is generally used by buildings (such as Forge and CyberneticsCore) or by units that neither move nor attack.
PROTOSS_BASIC_ACTION_2 (regular combat units that can attack)
```python
PROTOSS_BASIC_ACTION_2 = [
    {'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
    {'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
    {'name': 'Hold_Position', 'arg': [], 'func': [(274, F.HoldPosition_quick, ('queued'))]},
    {'name': 'Move_Minimap', 'arg': ['minimap'], 'func': [(332, F.Move_minimap, ('queued', 'minimap'))]},
    {'name': 'Move_Screen', 'arg': ['screen'], 'func': [(331, F.Move_screen, ('queued', 'screen'))]},
    {'name': 'Attack_Unit', 'arg': ['tag'], 'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},
]
```
- Units such as Stalker, Zealot, and Immortal can all use these.
PROTOSS_BASIC_ACTION_3 (support units that cannot attack)
```python
PROTOSS_BASIC_ACTION_3 = [
    {'name': 'Stop', 'arg': [], 'func': [(453, F.Stop_quick, ('now'))]},
    {'name': 'No_Operation', 'arg': [], 'func': [(0, F.no_op, ())]},
    {'name': 'Hold_Position', 'arg': [], 'func': [(274, F.HoldPosition_quick, ('queued'))]},
    {'name': 'Move_Minimap', 'arg': ['minimap'], 'func': [(332, F.Move_minimap, ('queued', 'minimap'))]},
    {'name': 'Move_Screen', 'arg': ['screen'], 'func': [(331, F.Move_screen, ('queued', 'screen'))]},
]
```
- There is no Attack_Unit here, because units of this kind (such as Observer and Warp Prism) have no weapon and can only move and hold position.
PROTOSS_ACTION_WARPTRAIN
The Protoss have a special way of producing units: Pylon plus Warp Gate warp-ins. The "WarpTrain" action set is built for this.
```python
PROTOSS_ACTION_WARPTRAIN = [
    {'name': 'Warp_Adept_Near', 'arg': ['tag'],
     'func': [(8, F.select_warp_gates, ('select')),
              (573, F.llm_pysc2_move_camera, ('world_tag')),
              (505, F.TrainWarp_Adept_screen, ('queued', 'screen_tag'))]},
    {'name': 'Warp_Zealot_Near', 'arg': ['tag'],
     'func': [(8, F.select_warp_gates, ('select')),
              (573, F.llm_pysc2_move_camera, ('world_tag')),
              (510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),
              (510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),
              (510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),
              (510, F.TrainWarp_Zealot_screen, ('queued', 'screen_tag')),]}
    # similar entries: Warp_Stalker_Near, Warp_Sentry_Near, etc.
]
```
- Each action first calls select_warp_gates (func_id=8), then moves the camera to the given tag (func_id=573, a custom camera-move function), and finally executes TrainWarp_xxx_screen at a position near the tag on screen.
- For example, Warp_Adept_Near(0x12345) warps in an Adept near unit 0x12345 (most likely a Pylon or a Warp Prism).
- Warp_Zealot_Near repeats (510, F.TrainWarp_Zealot_screen, ...) four times, which indicates four consecutive Zealot warp-in commands, possibly to produce a batch in one go.
PROTOSS_ACTION_TRAIN
```python
PROTOSS_ACTION_TRAIN = [
    # Nexus
    {'name': 'Train_Mothership', 'arg': [], 'func': [(541, F.Train_Mothership_quick, ('queued'))]},
    # Gateway
    {'name': 'Train_Adept', 'arg': [], 'func': [(457, F.Train_Adept_quick, ('queued'))]},
    {'name': 'Train_Stalker', 'arg': [], 'func': [(493, F.Train_Stalker_quick, ('queued'))]},
    {'name': 'Train_Zealot', 'arg': [], 'func': [(503, F.Train_Zealot_quick, ('queued'))]},
    # Stargate
    {'name': 'Train_Oracle', 'arg': [], 'func': [(482, F.Train_Oracle_quick, ('queued'))]},
    {'name': 'Train_Carrier', 'arg': [], 'func': [(461, F.Train_Carrier_quick, ('queued'))]},
    # RoboticsFacility
    {'name': 'Train_Immortal', 'arg': [], 'func': [(473, F.Train_Immortal_quick, ('queued'))]},
    {'name': 'Train_Colossus', 'arg': [], 'func': [(462, F.Train_Colossus_quick, ('queued'))]},
    # ...
]
```
- Each Train_xxx maps to the action ID of Train_xxx_quick in pysc2; for example, Train_Zealot_quick has ID 503.
- The LLM only needs to output <Train_Stalker()>; the backend converts it into (493, F.Train_Stalker_quick, ('queued')).
- The backend also has a function (add_func_for_train_and_research) that automatically performs the preparatory steps, such as selecting an idle building and moving the camera.
PROTOSS_ACTION_RESEARCH
```python
PROTOSS_ACTION_RESEARCH = [
    {'name': 'Research_WarpGate', 'arg': [], 'func': [(428, F.Research_WarpGate_quick, ('queued'))]},
    {'name': 'Research_Blink', 'arg': [], 'func': [(356, F.Research_Blink_quick, ('queued'))]},
    {'name': 'Research_Charge', 'arg': [], 'func': [(359, F.Research_Charge_quick, ('queued'))]},
    {'name': 'Research_PsiStorm', 'arg': [], 'func': [(401, F.Research_PsiStorm_quick, ('queued'))]},
    # ...
]
```
- The func entries commonly use ('queued'), meaning the research command is appended after the commands already in the queue.
PROTOSS_ACTION_BUILD and PROTOSS_ACTION_EASY_BUILD
PROTOSS_ACTION_BUILD (standard build actions)
```python
PROTOSS_ACTION_BUILD = [
    {'name': 'Build_Nexus_Near', 'arg': ['tag'],
     'func': [(573, F.llm_pysc2_move_camera, ('world_tag')),
              (65, F.Build_Nexus_screen, ('queued', 'screen_tag'))]},
    {'name': 'Build_Nexus_Screen', 'arg': ['screen'],
     'func': [(65, F.Build_Nexus_screen, ('queued', 'screen'))]},
    {'name': 'Build_Pylon_Screen', 'arg': ['screen'],
     'func': [(70, F.Build_Pylon_screen, ('queued', 'screen'))]},
    # plus actions for building Assimilators, Gateways, and the various tech buildings
]
```
- Build_Nexus_Near(tag): first moves the camera to the given tag, then calls Build_Nexus_screen at the tag's position on screen. Mostly used to place a Nexus near a particular location such as a mineral field.
- Build_Nexus_Screen([x,y]): issues the build command directly at screen coordinate [x,y]. Used for more precise building placement.
PROTOSS_ACTION_EASY_BUILD (simplified building)
```python
PROTOSS_ACTION_EASY_BUILD = [
    {'name': 'Build_Nexus_Near', 'arg': ['tag'],
     'func': [(65, F.Build_Nexus_screen, ('queued', 'screen_tag'))]},
    # ... omitted ...
]
```
- These entries have no (573, F.llm_pysc2_move_camera, ...) step and perform no extra checks.
- They simply build near the given tag.
PROTOSS_ACTION_ABILITY
This last part covers the Protoss's rich set of unit abilities: Blink, Force Field, Psi Storm, Guardian Shield, Warp Prism mode switching, Graviton Beam, Mass Recall, and so on.
```python
PROTOSS_ACTION_ABILITY = [
    {'name': 'Morph_WarpPrismPhasingMode', 'arg': [],
     'func': [(329, F.Morph_WarpPrismPhasingMode_quick, ('queued'))]},
    {'name': 'Morph_Archon', 'arg': [],
     'func': [(296, F.Morph_Archon_quick, ('queued'))]},
    {'name': 'Ability_Blink_Screen', 'arg': ['screen'],
     'func': [(180, F.Effect_Blink_screen, ('queued', 'screen'))]},
    {'name': 'Ability_PsiStorm_Screen', 'arg': ['screen'],
     'func': [(218, F.Effect_PsiStorm_screen, ('queued', 'screen'))]},
    {'name': 'Ability_ForceField_Screen', 'arg': ['screen'],
     'func': [(193, F.Effect_ForceField_screen, ('queued', 'screen'))]},
    {'name': 'Ability_NexusMassRecall_Near', 'arg': ['tag'],
     'func': [(573, F.llm_pysc2_move_camera, ('world_tag')),
              (529, F.Effect_MassRecall_Nexus_screen, ('queued', 'screen_tag'))]},
    # ...
]
```
- Ability_Blink_Screen([x, y]): makes the selected Blink-capable units (Stalkers) blink to screen coordinate (x, y).
- Ability_NexusMassRecall_Near(tag): moves the camera to the tag's position, then uses the Nexus's Mass Recall on that area.
- Morph_WarpPrismPhasingMode: switches the Warp Prism to phasing mode, turning it into a power source next to which units can be warped in.
In these actions, arg is frequently set to ['tag'], ['screen'], ['minimap'], and so on. These correspond to different parameter types:
- tag: the unique ID of a unit or building in the SC2 engine (hexadecimal, 0x....).
  - Writing 'screen_tag' or 'world_tag' in func means the tag is converted into a screen coordinate or a world coordinate before the operation is executed.
- screen: a screen coordinate in [x, y] format.
  - Used by Move_screen, Attack_screen, Build_xxx_screen, Effect_xxx_screen, and similar functions.
- minimap: a minimap coordinate in [x, y] format.
  - Used by Move_minimap.
- Extra functions: for example, (573, F.llm_pysc2_move_camera, ('world_tag')) is a custom camera-move operation that converts the tag into a world coordinate and then points the camera there.
Many actions are in fact several func triples chained together, for example:
```python
(8, F.select_warp_gates, ('select')),
(573, F.llm_pysc2_move_camera, ('world_tag')),
(509, F.TrainWarp_Stalker_screen, ('queued', 'screen_tag'))
```
This means "select the Warp Gates → move the camera to a unit's position → use TrainWarp_Stalker at that position".
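The placeholder substitution described above can be sketched as follows. This is an illustration under stated assumptions, not the project's actual translator: `normalize`, `resolve_steps`, and the two lookup tables `tag_to_screen` and `tag_to_world` are hypothetical names, and the coordinates are made-up sample values. In the real environment the positions would come from the observation.

```python
from typing import Any, Dict, List, Sequence, Tuple


def normalize(arg_spec: Any) -> Tuple[Any, ...]:
    """('queued') is just the string 'queued' in Python, so wrap it in a tuple."""
    return (arg_spec,) if isinstance(arg_spec, str) else tuple(arg_spec)


def resolve_steps(action: Dict[str, Any],
                  llm_args: Sequence[Any],
                  tag_to_screen: Dict[int, Tuple[int, int]],
                  tag_to_world: Dict[int, Tuple[float, float]]) -> List[Tuple[int, tuple]]:
    """Turn an action's func triples into (func_id, concrete_args) steps."""
    named = dict(zip(action['arg'], llm_args))  # e.g. {'tag': 0x12345}
    steps = []
    for func_id, _func, arg_spec in action['func']:
        concrete = []
        for placeholder in normalize(arg_spec):
            if placeholder == 'screen_tag':
                concrete.append(tag_to_screen[named['tag']])
            elif placeholder == 'world_tag':
                concrete.append(tag_to_world[named['tag']])
            elif placeholder in named:  # 'screen' or 'minimap' passed directly
                concrete.append(tuple(named[placeholder]))
            else:  # literal option such as 'queued', 'now', or 'select'
                concrete.append(placeholder)
        steps.append((func_id, tuple(concrete)))
    return steps


# Same shape as Warp_Stalker_Near; None replaces the PySC2 callables so the
# sketch runs without pysc2 installed.
WARP_STALKER_NEAR = {
    'name': 'Warp_Stalker_Near', 'arg': ['tag'],
    'func': [(8, None, ('select')), (573, None, ('world_tag')),
             (509, None, ('queued', 'screen_tag'))],
}

steps = resolve_steps(WARP_STALKER_NEAR, [0x12345],
                      tag_to_screen={0x12345: (60, 40)},
                      tag_to_world={0x12345: (100.0, 80.0)})
```

Each resulting step pairs a function id with fully concrete arguments, which is the form a PySC2 function call ultimately needs.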
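What the file does:
- It uses concise names (Build_Pylon_Screen, Train_Stalker, Ability_Blink_Screen, and so on) to represent PySC2's more complex "(func_id, func_call, args)" operations.
- The LLM only needs to output <Build_Pylon_Screen([10,20])> for the backend translator to execute the correct (70, F.Build_Pylon_screen, ('queued',(10,20))).
Organization by category:
- The actions are grouped into basic, warp-in, training, research, building, and ability lists.
Execution mechanism:
- An action may need preparatory steps (such as select_point to select a unit and llm_pysc2_move_camera to move the camera) before it can execute successfully.
- For example, for <Train_Stalker()>, the program first finds an idle Gateway automatically, moves the camera there and selects it, and then executes (493, F.Train_Stalker_quick, ('queued')).
This wrapping greatly reduces the difficulty of writing commands for a large model or high-level AI: it only has to care about what to do, not how to find the matching ID and coordinates in PySC2. It also avoids invalid operations such as "a worker trying to use Blink".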
Visual-CoT: A Visual Guided Chain of Thought Framework
for Fine-grained Control in StarCraft II Actions
Yanan Ni (a,*), Runnan Qi (a), Zongyuan Li (b), Kuihua Huang (a), Lumin Jiang (a), Xiaojie Xu (b), Guo Xian (b), and Xuebo Zhang (b)
(a) Laboratory for Big Data and Decision, National University of Defense Technology, Changsha, China
(b) College of Artificial Intelligence, Nankai University, Tianjin, China
ABSTRACT
In recent years, large language models (LLMs) have made significant progress in the field of natural language
processing. However, fine-grained action control in complex environments remains a significant challenge. To
address this issue, we propose a framework called Visual-CoT (Visual guided Chain of Thought), which aims to
enhance the ability of LLMs to perform fine-grained action control in complex strategy games such as StarCraft
II. This framework integrates visual information with the chain of thought, allowing visual data to seamlessly
interact with the reasoning and decision-making processes of LLMs. We implement Visual-CoT on the LLM-PySC2 environment (the Large Language Model StarCraft II Learning Environment) and validate it on a map
featuring high-ground terrain information, simulating a scenario where one Colossus faces thirty-two Zerglings.
Experimental results show that the Visual-CoT framework improves the LLM’s win rate to 45% (winning 9 out
of 20 games), significantly enhancing fine-grained control and precise action execution capabilities. The code is
open-sourced and available at https://github.com/Flycat-Tom/Visual-COT.
Keywords: large language models, Visual-CoT, fine-grained action control, StarCraft II, strategy games
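After reading through the whole paper, it is clear that it centers on Visual-CoT (Visual guided Chain of Thought) and explores how to combine visual information with the chain-of-thought (CoT) reasoning process to tackle micro-management and strategic decision-making in StarCraft II (SC2). The paper's approach is as follows:
The experiments show that Visual-CoT improves the precision of micro-control and the understanding of terrain and unit distribution, raising the win rate from 0% to 45%. The authors also run ablations (removing the image or removing CoT) to confirm the importance of both the visual information and the reasoning chain.
Based on the paper, here are some ideas and directions for further improvement, for reference:
The existing experiments only cover one special map, 1 Colossus vs. 32 Zerglings.
Testing on more complex environments and unit compositions would verify the method's generalization and confirm that it is not effective in only one specific scenario.
At present, the paper's core experiments all involve micro-managing a single unit (the Colossus).
This would greatly increase decision complexity, and it would also show the high-level reasoning of Visual-CoT and CoT more fully.
In the paper, visual information assists the model's decisions mainly through screenshots plus simple boundary markings:
In this way "visual guidance" is not merely "showing the model a screenshot" but genuinely understanding and annotating the map automatically, and using it more effectively inside the chain of thought.
In the paper, one image is captured per second (about 22.4 frames in SC2) and one inference is made. In a real game, latency or timeouts may cause decisions to lag.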
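A small worked calculation makes the latency concern concrete. It uses the 22.4 frames-per-second figure quoted above; the function names and the assumption that a new decision cannot start before the previous inference has finished are mine, for illustration only.

```python
GAME_LOOPS_PER_SECOND = 22.4  # figure quoted above for one second of play


def loops_elapsed(latency_seconds: float) -> float:
    """Game loops that pass while the model is still producing its answer."""
    return latency_seconds * GAME_LOOPS_PER_SECOND


def decisions_per_minute(latency_seconds: float,
                         capture_interval_seconds: float = 1.0) -> float:
    """Decision rate, assuming (hypothetically) that inferences never overlap.

    The effective period is whichever is longer: the capture interval
    or the time one inference takes.
    """
    period = max(latency_seconds, capture_interval_seconds)
    return 60.0 / period
```

Under these assumptions, an inference that takes 3 seconds lets the game advance by roughly 67 loops before the chosen action is issued and cuts the decision rate to 20 per minute, which is why slow responses translate directly into stale micro-management.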
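The paper mostly demonstrates a one-way pipeline of "visual input + CoT reasoning → action output". To further improve the model's autonomy:
The paper only compares the LLM-PySC2 baseline against variants with or without vision and with or without CoT; it does not compare against reinforcement learning, evolutionary algorithms, or traditional scripted AI:
Chain of Thought is often used for one-shot reasoning followed by an output. In RTS games, however, the state changes continuously, so multi-turn or multi-step iteration is needed:
The paper has open-sourced its code at https://github.com/Flycat-Tom/Visual-COT, which is very helpful to the community. Wider adoption would require more work on engineering and extensibility:
The Visual-CoT approach proposed in the paper combines visual perception with a large model's chain-of-thought reasoning and achieves a clear improvement in control precision and win rate in one specific scenario (1 Colossus vs. 32 Zerglings). To refine and strengthen the method, follow-up research could address generalization across scenarios, multiple units and multiple agents, richer visual processing, real-time performance and interaction, hybridization with other AI methods, and engineering extensibility.
Deeper work in these directions should let Visual-CoT show stronger decision-making ability and practical value in StarCraft II and in complex strategy games more broadly.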
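Could there be multiple chains of thought, with a different reasoning process selected automatically according to the battlefield situation?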
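Yes, it is possible to design a "multiple chains of thought" (Multi-CoT) approach in which the model switches automatically to a suitable reasoning process according to the current battlefield situation or task requirements, producing more targeted and efficient decisions. Here are several possible implementation ideas and frameworks, for reference:
This yields several chains of thought (a micro chain, a macro chain, a scouting chain, and so on), with a high-level chain coordinating or selecting among them and activating each only when needed.
"A more detailed micro analysis is needed now; call <Micro_CoT>"
Context management:
Execution speed and resource consumption:
Priority conflicts:
To implement multiple chains of thought and let the model choose among them automatically according to the battlefield situation, the core points are:
If these ideas are combined with multimodal input similar to Visual-CoT (feeding the visual information into the multiple chains as well), different chains can focus on different visual features in different situations, which improves how well the agent's decisions adapt and how precise its control is in RTS games.
"Multiple chains of thought + dynamic selection" is therefore a feasible and promising extension that can let a game AI decide more flexibly and efficiently in complex, fast-changing StarCraft II matches.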
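Because my unit is a Colossus, it can use the cliffs to kite the enemy and attack from range. Give me some ideas that combine this with multi-CoT selection.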
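Given that the Colossus can cross cliffs and attack from range, and given the Multi-CoT idea you mentioned, the Colossus's combat logic can be split into several selectable chains of thought. Each chain focuses on different key points, and a "routing decision" or "upper-level scheduler" selects, switches, or merges them dynamically during the game. Here are several workable ideas, for reference:
The chains can be split along two dimensions, function or scenario. Common options are:
Offense chain (Offense Chain)
Defense chain (Defense/Escape Chain)
Mobility chain (Mobility Chain)
Kiting chain (Kite/Foxtrot Chain)
Scouting chain (Scout Chain) (optional)
At any given moment several chains may produce move or attack commands at once, but each has its own priority. A possible dynamic selection scheme follows.
Rules and checks of this kind can also be handled by a "scheduler CoT" that assesses the environment automatically and outputs which sub-chain to call.
The kiting chain is one of the Colossus's core playstyles. Its typical flow is:
Initial position assessment:
Dealing damage:
<Attack_Unit(xxx)>;
Retreating or crossing the cliff:
<Move_Screen([x', y'])>
<Move_Screen> to the far side of the cliff.
Repeating the loop:
Within this chain, checks such as a distance threshold and a health threshold can be refined to make the kiting logic smarter.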
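The threshold checks mentioned above could look like the sketch below. It is a hand-written rule, not an LLM chain, and every name and number in it (`kiting_step`, `safe_distance=4.0`, `low_health=0.3`, the returned action tuples) is a hypothetical choice for illustration; positions are plain 2D coordinates in whatever unit the observation provides.

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]


def kiting_step(colossus_pos: Point,
                zergling_positions: Sequence[Point],
                weapon_cooldown: float,
                health_ratio: float,
                safe_distance: float = 4.0,
                low_health: float = 0.3) -> Tuple[str, Optional[object]]:
    """Pick one kiting action: ('attack', index), ('retreat', direction) or ('hold', None)."""
    if not zergling_positions:
        return ('hold', None)

    # Index and distance of the closest Zergling.
    nearest_idx, nearest_dist = min(
        ((i, math.dist(colossus_pos, p)) for i, p in enumerate(zergling_positions)),
        key=lambda item: item[1])

    must_move = (nearest_dist < safe_distance   # distance threshold
                 or health_ratio < low_health   # health threshold
                 or weapon_cooldown > 0)        # nothing to gain by standing still
    if not must_move:
        return ('attack', nearest_idx)

    # Retreat along the unit vector pointing away from the nearest Zergling.
    if nearest_dist == 0:
        return ('retreat', (1.0, 0.0))  # arbitrary direction when overlapping
    zx, zy = zergling_positions[nearest_idx]
    direction = ((colossus_pos[0] - zx) / nearest_dist,
                 (colossus_pos[1] - zy) / nearest_dist)
    return ('retreat', direction)
```

A rule like this can serve as a fallback or as a sanity check on the action a kiting chain proposes; it ignores cliffs entirely, so choosing a retreat direction that actually crosses the cliff would still need the terrain information.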
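Suppose a high-level decision is made on every game cycle (for example once per second or once every few frames). Pseudocode for this could be:
```python
# The three *_chain_reasoning functions are placeholder stubs so the sketch runs;
# in practice each would run its own chain-of-thought prompt.
def defense_chain_reasoning(env_state):
    return "<Move_Screen([120, 80])>"


def offense_chain_reasoning(env_state):
    return "<Attack_Unit(0x12345)>"


def kiting_chain_reasoning(env_state):
    return "<Move_Screen([120, 80])>"


def high_level_decision(env_state):
    # env_state holds the Colossus position, the enemy Zergling distribution,
    # the map's cliff information, and so on.
    # (A) Make some global checks first.
    if env_state.colossus_health < 0.3 or env_state.zerglings_very_close:
        selected_chain = "Defense_Chain"   # prioritize defending and retreating
    elif env_state.colossus_weapon_cooldown <= 0:
        selected_chain = "Offense_Chain"   # there is a chance to deal damage
    else:
        selected_chain = "Kiting_Chain"    # default to repositioning or kiting

    # (B) Route to the matching chain for more detailed CoT reasoning.
    if selected_chain == "Defense_Chain":
        return defense_chain_reasoning(env_state)
    elif selected_chain == "Offense_Chain":
        return offense_chain_reasoning(env_state)
    else:
        return kiting_chain_reasoning(env_state)
```
- defense_chain_reasoning(), offense_chain_reasoning(), and kiting_chain_reasoning() each represent a different chain of thought.
- Once a chain produces concrete actions (such as <Move_Screen([120, 80])> or <Attack_Unit(0x12345)>), they are sent to the game environment for execution.
Cliff-hopping harassment (Harass):
Handling unexpected events:
Multi-unit coordination:
Visual self-explanation: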
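With this framework of multiple chains of thought plus dynamic scheduling, the Colossus (and other units that can cross terrain or have a range advantage) can use cliffs, high ground, and similar terrain more flexibly in StarCraft II to kite and deal damage efficiently, improving both its survivability and its damage output in actual play.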
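First: please look at my original prompt.
Second: the Colossus attacks the Zerglings without any scouting; it only needs to attack from range and avoid being approached.
Third: drawing on the multi-CoT idea, give me some suggestions for adjusting it.
The prompt is as follows: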
class BasePrompt:
    def __init__(self):
        self.sp = ''
        self.eip = ''
        self.eop = ''
        self.screen_img_rgb_prompt = ''
        self.screen_img_fea_prompt = ''
        self.minimap_img_rgb_prompt = ''
        self.minimap_img_fea_prompt = ''
class CombatGroupPrompt(BasePrompt):
    def __init__(self, name, log_id, config):
        super(CombatGroupPrompt, self).__init__()
        self.name = name
        self.config = config
        self.log_id = log_id

        # Part 1
        self.sp = f"""
        1. Role Description
        You are a {self.config.AGENTS[self.name]['describe']}. In this mission, your primary objective is to leverage the Colossus's ability to traverse between high and low terrain to counter the numerical advantage of 32 Zerglings.
        Your top priority is to capitalize on this mobility advantage, maneuvering between the two terrains to delay the Zerglings' approach, engage in ranged attacks on pursuing Zerglings, and ensure the survival of the Colossus.

        2. Map Information
        2.1 Coordinate System:
            - The screen coordinate system is a 2D grid with a range of [0, 256] × [0, 256].
            - Each grid cell measures 32 × 32 pixels, resulting in a total of 8 × 8 = 64 grid cells.
            - The origin point (0, 0) is at the top-left corner, while (256, 256) is at the bottom-right corner.
        2.2 Terrain Information:
            - Pathway (connecting the high ground and low ground): The map consists of an upper platform (high ground) and a lower platform (low ground), connected by a narrow pathway on the left side.
            - High Ground (light green area): The high ground is characterized by light green grass textures and is located on the upper platform.
            - Low Ground (dark gray area): The low ground is characterized by dark gray concrete textures and is located on the lower platform.
            - Impassable Area (black background): The edges of the map consist of a black background, which is impassable. The Colossus must avoid entering these areas.
            - Cliff (light gray vertical surface): The cliff separates the high ground and low ground. The Colossus can freely traverse the cliff, but Zerglings can only access the high ground through the narrow pathway on the left. Only the boundary between the high ground and low ground is considered a cliff; other deep red boundaries are impassable edges of the map.
            - Boundary: The boundaries between the high ground and the black background, as well as between the low ground and the black background, are considered impassable edges of the map. The boundaries are represented by deep red lines on the map.

        3. Key Victory Points:
        3.1 Maintain Maximum Attack Range:
            - Ideally, the Colossus should stay within its maximum attack range (range 7) to fully utilize its long-range attack advantage while minimizing the risk of melee attacks from enemies.
        3.2 Focus Fire on Dense Enemy Clusters:
            - Target the center of the densest enemy groups to maximize splash damage and quickly reduce enemy numbers.
        3.3 Cliff Barrier Delay Effect:
            The Cliff Barrier Delay Effect refers to leveraging the terrain characteristics of cliffs, where enemy units cannot directly cross the cliff, while allied units can freely traverse it. This forces the enemy to take a detour when transitioning between high and low ground, buying time for allied units to reposition, cool down weapons, and strike effectively.
            - When enemies are heavily concentrated on the high ground:
                - If allied units are also on the high ground but risk being surrounded, it is recommended to retreat to the bottom-right low ground. This utilizes the cliff as a barrier to delay enemy pursuit, enabling ranged attacks on the enemies while maintaining battlefield control.
            - When enemies are heavily concentrated on the low ground:
                - If allied units are also on the low ground but risk being surrounded, it is recommended to retreat to the top-right high ground. The cliff barrier forces enemies to take a detour, providing time for allied units to reposition and counterattack effectively.
4. Decision Process
Based on the observed data of the Colossus and Zerglings, as well as the RGB image of the screen, use the following decision rules to determine action priorities:
    4.1 Condition Assessment Rules:
        - IS_HIGH_GROUND:
            - Criteria: If the Colossus is on the green high ground (above the cliff, with a clear boundary from the low ground), set to True; otherwise, set to False.
            - Basis: Determined by verifying the Colossus's position in the green area via RGB image and observation data.
        - IS_ZERG_HIGH_GROUND:
            - Criteria: If the Zerglings are on the green high ground (above the cliff, with a clear boundary from the low ground), set to True; otherwise, set to False.
            - Basis: Determined by analyzing the Zerglings' positions via RGB image.
        - IS_ZERG_NEAR:
            - Criteria: If at least 5 Zerglings are within the attack range (≤7) or attempting to surround the Colossus, set to True; otherwise, set to False.
            - Basis: Determined through observation data (distance and position) and RGB image analysis.
        - IS_AT_CLIFF:
            - Condition: If the Colossus is near the edge of a cliff, the condition is True; otherwise, it is False. It is important to distinguish between a cliff and a boundary. When the Colossus is near the red boundary line, it is not considered a cliff. A cliff is the transitional area between the highland and the lowland.
            - Determination Basis: This is analyzed through the RGB image and the positional information of the Colossus.
        - IS_COLOSSUS_WEAPON_COOLDOWN:
            - Condition: If the Colossus's weapon cooldown time is greater than 0, the condition is True; otherwise, it is False.
            - Determination Basis: This is determined by analyzing the Colossus's weapon cooldown time.
        - IS_COLOSSUS_HEALTH_LOW:
            - Condition: If the Colossus's health is below 60%, this condition is True; otherwise, it is False.
            - Determination Basis: Determined by observing the Colossus's health.
    4.2 Action Decision Rules
        - Special Scenarios:
            rule1: As long as the weapon is in cooldown, the priority of movement actions is higher than attack actions. Movement > Attack.
            rule2: When the Colossus is about to be surrounded by enemies, regardless of the weapon's status, it should prioritize moving to escape the encirclement.
            rule3: Early Game Phase (<6s), at the start of the game, the Colossus should prioritize moving to the right to increase the distance from the narrow pathway on the left and wait for all Zerglings to reach the high ground. During this phase, movement takes priority over attacking.
        - Situation 1: The Colossus and Zerglings are in different areas
            - Weapon not on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = False):
                - Priority: Attack. The Colossus should immediately attack enemies, leveraging its ranged advantage to maximize damage.
            - Weapon on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True):
                - Priority: Move. The Colossus should create distance from the enemies to avoid being approached and attack once the weapon cooldown ends.
        - Situation 2: The Colossus and Zerglings are in the same area
            - Enemies are near and about to attack (IS_ZERG_NEAR = True, Zerglings approaching):
                - Priority: Move. Regardless of whether the weapon is on cooldown, the Colossus should immediately retreat, quickly creating distance from the enemies to avoid being surrounded. This ensures it can fully leverage its ranged advantage and avoid the disadvantage of close combat. Once out of danger, it can then look for opportunities to counterattack.
            - Enemies are not near (IS_ZERG_NEAR = False):
                - Weapon not on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = False):
                    - Priority: Attack. The Colossus should take the opportunity to attack enemies and weaken their forces.
                - Weapon on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True):
                    - Priority: Move. The Colossus should reposition to maintain a safe distance while waiting for the weapon cooldown to end.
        - Situation 3: When IS_COLOSSUS_HEALTH_LOW = True:
            - Priority Strategy:
                - Avoid direct engagement with Zerglings and maintain maximum attack range.
                - Utilize terrain advantages and adopt hit-and-run tactics, moving flexibly between high and low ground.
                - Priority: Movement > Attack
5 Integrated Strategy and Decision Framework
1. Strategy Suggestions
text1.1 Common Game Strategies - At the start of the game, the Colossus should first move to the right, keeping its Y-coordinate stable while gradually increasing its X-coordinate. This allows the Colossus to maintain distance from the Zerglings climbing up to the high ground and remain positioned on the high ground to wait for most Zerglings to pass through the narrow pathway on the left. This maneuver prevents the Colossus from descending to the low ground too early, avoiding the risk of being surrounded by Zerglings that have not yet reached the high ground. - Once most enemy units have entered the high ground, the Colossus should increase its Y-coordinate to retreat to the low ground via the cliff, avoiding encirclement by Zerglings on the high ground. 1.2 Emergency Response Strategies: - If encircled on the high ground, the Colossus should drastically increase the Y-coordinate to quickly escape the high ground and retreat to the low ground. - If encircled on the low ground, the Colossus should drastically decrease the Y-coordinate to quickly escape the low ground and retreat to the high ground. 2. Movement and Attack Decision 2.1 When Movement Is Required: - Phase Judgment: - Determine the current phase based on game time and the positions of both allied and enemy units. - Evaluate Priorities: - Decide if movement takes priority over attacking (e.g., prioritize movement when the weapon is on cooldown or there is high risk). - Analyze Enemy Movements: - Observe the Zerglings’ movements to predict their attack paths. - Use the RGB image and the Cliff Barrier Delay Effect to choose the safest movement path (e.g., move down-right or up-right). - Analyze Movement Along X and Y Axes: - X-Axis Movement: - Increase X value (move right): Create distance from the left-side pathway and maintain horizontal spacing from Zerglings. - Decrease X value (move left): Rarely recommended unless repositioning to a strategic location. 
- Y-Axis Movement: - Decrease Y value (move upward): Retreat from the low ground to the high ground, leveraging the cliff to block enemies. - Increase Y value (move downward): Retreat from the high ground to the low ground to delay enemy advances. 2.2 When Attacking Is Required: - Target Selection: - Prioritize the densest groups of enemies to maximize splash damage. - Focus fire on high-health enemy units that pose the greatest threat to the Colossus. - Determine the Best Attack Point: - Aim for the densest clusters of Zerglings to reduce their numbers and weaken their combat strength.
6.Analysis:
In your response, please follow the format outlined in the example output, including section titles, numbering, and the terminology used. Ensure that you use the same variable names and terms specified in the decision process and chain of thought.
textThe output should include: 1. Decision Process 1.1 Condition Assessment: - `IS_COLOSSUS_HIGH_GROUND = True/False` - `IS_ZERGLING_HIGH_GROUND = True/False` - `IS_ZERGLING_NEAR_COLOSSUS = True/False` - `IS_COLOSSUS_AT_CLIFF = True/False` - `IS_COLOSSUS_WEAPON_COOLDOWN = True/False` - `IS_COLOSSUS_HEALTH_LOW = True/False' 1.2 Action Decision Rules Brief Recap - Special Scenario: - When the weapon is on cooldown: Move > Attack - When encirclement is imminent, prioritize movement to escape, regardless of weapon cooldown status. - Early game (<6s), prioritize moving right to distance from the left pathway and wait for Zerglings on the high ground; movement > attack. - Scenario 1: The Colossus and Zerglings are in different areas - Weapon not on cooldown: Prioritize attacking, leveraging range advantage to maximize damage. - Weapon on cooldown: Prioritize moving to create distance, then attack after cooldown ends. - Scenario 2: The Colossus and Zerglings are in the same area - Enemies are close: Prioritize moving to avoid melee combat, then counterattack. - Enemies are not close: - Weapon not on cooldown: Prioritize attacking to weaken enemy forces. - Weapon on cooldown: Prioritize moving to maintain a safe distance. - Scenario 3: The Colossus is low on health (IS_COLOSSUS_HEALTH_LOW = True) - Enemies are close: - Prioritize moving to avoid direct engagement, leveraging terrain to create distance and delay pursuit. - Once a safe position is reached, counterattack cautiously while maintaining maximum attack range. - Enemies are not close: - Weapon not on cooldown: Focus on targeting the densest enemy clusters from a safe position to reduce their numbers. - Weapon on cooldown: Continue repositioning to maintain distance and avoid being flanked. 1.3 Judgment Based on 4.2 Action Decision Rules: -Provide movement and attack priorities, e.g., attack > move or move > attack. 
-Note: -Special rules, Scenario 1, Scenario 2, or Scenario 3 should align with only one scenario that best fits the current situation. -Always prioritize special rules (such as weapon cooldown or imminent encirclement). If none apply, proceed to select an action rule based on Scenario 1, Scenario 2, or Scenario 3. 2 Integrated Strategy and Decision Framework 2.1 Strategy Suggestions - Current Phase Determination: Dynamically determine the strategy based on time, the Colossus’s position, and Zergling distribution. Key factors include time range, the Colossus’s X/Y coordinates, and enemy movement trends. - Consider Common Action Recommendations:: - Early Game Observation Phase: Initially, keep the X-coordinate steady while gradually increasing the Y-coordinate to monitor enemy movements. Utilize the Colossus’s range advantage to weaken enemy forces. - Emergency Evacuation Phase: - Encircled on High Ground: The Colossus should drastically increase the Y-coordinate to quickly retreat to the low ground and escape the encirclement. - Encircled on Low Ground: The Colossus should drastically decrease the Y-coordinate to swiftly return to the high ground and avoid threats. - Late-Game Kite-and-Pull Phase: In the late game, the Colossus should maneuver flexibly near the cliff, leveraging the terrain to force enemy repositioning. Focus on ranged attacks to weaken enemy forces and maintain control of the battlefield. 2.2 Movement Decision Analysis - Enemy Approach Path Prediction: Analyze the primary approach paths of enemy units and identify potential threats. - Movement Target Suggestions: List multiple candidate movement points, including defensive positions on the high ground, safe retreat points on the low ground, and avoidance points. Specify the most suitable movement target (e.g., a safe spot on the high ground or near the cliff on the low ground). 
Explain the reasoning behind the selection (e.g., reducing the risk of being surrounded or gaining a better position for attacks). 2.3 Attack Decision Analysis - Target Prioritization: Prioritize attacking dense enemy clusters to maximize splash damage. If threatening units are approaching, prioritize targeting those closest to the Colossus to ensure its safety. - Attack Target Selection: Identify multiple candidate attack points (e.g., the center of a dense enemy group or the closest group to the Colossus). Specify the optimal attack point and provide reasoning (e.g., to maximize damage or avoid being overwhelmed). 7.Action Output: Team Colossus-1: Action steps. """ self.eip = \ """ Game Info Time: 0:05 Team Colossus-1 Info: Team minimap position: [31, 27] Controlled Team Units: Unit: Colossus, Tag: 0x100000001, ScreenPos: [150, 123], Health: 350 (100%), Weapon Cooldown Time: 0.91s Nearby Enemy Units: Unit: Zergling, Tag: 0x1002c0001, ScreenPos: [52, 164], Distance: 10, Health: 6 (17%) Unit: Zergling, Tag: 0x100180001, ScreenPos: [79, 159], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100440001, ScreenPos: [112, 140], Distance: 4, Health: 35 (100%) Unit: Zergling, Tag: 0x100640001, ScreenPos: [38, 176], Distance: 12, Health: 21 (60%) Unit: Zergling, Tag: 0x100700001, ScreenPos: [52, 178], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x100780001, ScreenPos: [42, 169], Distance: 11, Health: 6 (17%) Unit: Zergling, Tag: 0x100540001, ScreenPos: [49, 171], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x100500001, ScreenPos: [91, 146], Distance: 6, Health: 35 (100%) Unit: Zergling, Tag: 0x100800001, ScreenPos: [83, 147], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100280001, ScreenPos: [96, 151], Distance: 6, Health: 21 (60%) Unit: Zergling, Tag: 0x1000c0001, ScreenPos: [59, 162], Distance: 9, Health: 35 (100%) Unit: Zergling, Tag: 0x100040001, ScreenPos: [93, 157], Distance: 6, Health: 35 (100%) Unit: Zergling, Tag: 0x100340001, 
ScreenPos: [67, 172], Distance: 9, Health: 35 (100%) Unit: Zergling, Tag: 0x1001c0001, ScreenPos: [72, 161], Distance: 8, Health: 6 (17%) Unit: Zergling, Tag: 0x100600001, ScreenPos: [76, 152], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100140001, ScreenPos: [86, 162], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x1004c0001, ScreenPos: [117, 145], Distance: 3, Health: 35 (100%) Unit: Zergling, Tag: 0x100400001, ScreenPos: [66, 164], Distance: 9, Health: 21 (60%) Unit: Zergling, Tag: 0x100200001, ScreenPos: [78, 166], Distance: 8, Health: 35 (100%) Unit: Zergling, Tag: 0x100240001, ScreenPos: [104, 144], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100100001, ScreenPos: [59, 172], Distance: 10, Health: 35 (100%) Unit: Zergling, Tag: 0x1003c0001, ScreenPos: [86, 154], Distance: 7, Health: 6 (17%) Unit: Zergling, Tag: 0x100480001, ScreenPos: [46, 159], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x1006c0001, ScreenPos: [69, 155], Distance: 8, Health: 35 (100%) Unit: Zergling, Tag: 0x100580001, ScreenPos: [59, 154], Distance: 9, Health: 6 (17%) Unit: Zergling, Tag: 0x100740001, ScreenPos: [130, 134], Distance: 2, Health: 35 (100%) Unit: Zergling, Tag: 0x1005c0001, ScreenPos: [97, 142], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100380001, ScreenPos: [39, 183], Distance: 12, Health: 35 (100%) Unit: Zergling, Tag: 0x100080001, ScreenPos: [106, 151], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100300001, ScreenPos: [124, 137], Distance: 2, Health: 35 (100%) Relevant Knowledge: Protoss.Colossus The large quad-legged vehicle fires lasers in a splash pattern well-suited to destroying swarms of weaker units. This unit can also traverse differences in terrain height due to its long legs, and will appear to step over ledges and other obstacles due to the inverse kinematics system. 
Unit properties: ['ground', 'air', 'armored', 'massive', 'mechanical'] Weapon info: Attack Range 7, target: ['ground'], anti: ['light'], DPS(damage per second) 13, DPS-anti 20 Unit abilities: Zerg.Zergling Fast but weak melee attacker ideal for swarming attacks in large numbers. Unit properties: ['ground', 'light', 'biological'] Weapon info: Attack Range 1, target: ['ground'], DPS(damage per second) 10 Valid Actions: Team Colossus-1 Valid Actions: <Attack_Unit(tag)> <Move_Screen(screen)> Action Args: (1) tag: tag refers to a hexadecimal number, shape as 0x000000000. (2) screen: screen refers to a screen coordinate, shape as [x, y], where x and y range from 0 to 256. (3) minimap: minimap refers to a minimap coordinate, shape as [x, y], where x and y range from 0 to 64. For example, when you want to use an action like <Action_Name(tag, screen)>, you should output like <Action_Name(0x100580001, [37, 55])>; when you want to use an action like <Action_Name(screen)>, you should output like <Action_Name([66, 78])>. Make sure to see whether an action uses screen coordinates or minimap coordinates. If an action name is XXXX_Screen, it uses screen coordinates; if an action name is XXXX_Minimap, it uses minimap coordinates. Last Step Actions: Team Colossus-1: <Move_Screen([160, 130])> <Attack_Unit(0x1007c0001)> <Attack_Unit(0x100740001)> You need to confirm whether the previous action finished executing, and based on this, determine whether to continue the old strategy or immediately take other actions. Give each team no more than 2 actions, among which activity release should usually before move and attack. Now, start generating your analysis and actions: """ self.eop = \ """
Analysis:
Decision Process
1.1 Condition Assessment
- IS_COLOSSUS_HIGH_GROUND = True
  The Colossus is currently on the high ground, confirmed via the image and coordinates.
- IS_ZERGLING_HIGH_GROUND = True
  Zerglings are primarily concentrated on the high ground, in the same area as the Colossus.
- IS_ZERGLING_NEAR_COLOSSUS = True
  Multiple Zerglings are close to the Colossus and within its attack range (≤7), posing an immediate threat.
- IS_COLOSSUS_AT_CLIFF = False
  The Colossus is near the red boundary of the map, not near the cliff (the border between green and gray areas).
- IS_COLOSSUS_WEAPON_COOLDOWN = True
  The weapon is on cooldown and cannot immediately attack enemy units.
- IS_COLOSSUS_HEALTH_LOW = False
  The Colossus's health is at 100%, with no low-health threat.
1.2 Action Decision Rules
Decision basis:
1. Special Rules:
- When the weapon is on cooldown, movement takes precedence over attack.
- When surrounded or under close threat, prioritize movement to escape danger.
2. Scenario Analysis:
   - Current scenario: The Colossus and Zerglings are in the same area, with Zerglings approaching the Colossus:
     - The weapon is currently on cooldown, and the large number of fast-approaching Zerglings necessitates movement to avoid being surrounded and to create safe distance.
   - Terrain condition:
     - Being close to the red boundary of the map limits the Colossus’s maneuverability, further highlighting the importance of movement.
1.3 Judgment Based on Action Decision Rules
Integrated Strategy and Decision Framework
2.1 Strategy Suggestions
Current Strategy: Emergency Retreat to Low Ground
- The Colossus and Zerglings are both on the high ground. Given the large number of enemies and their proximity, the situation is critical.
- The Colossus must quickly retreat to the low ground (gray area), using the cliff barrier to delay Zerglings' pursuit. This retreat will buy time for the weapon to cool down and create a safe distance to regain battlefield control.
2.2 Movement Decision Analysis
-Enemy Approach Path Prediction
- Zerglings are advancing from the left pathway on the high ground, attempting to encircle the Colossus.
- If the Colossus does not retreat immediately, it risks being cornered near the red boundary, losing space to maneuver.
Movement Target Suggestions
Reason:
2.3 Attack Decision Analysis
- Target Prioritization:
  - Once the weapon cooldown ends, prioritize attacking the nearest Zerglings to reduce immediate threats and maximize splash damage.
- Attack Target Selection:
  - The closest Zerglings (e.g., Zergling, Tag: 0x1004c0001 and 0x100740001) pose a direct threat to the Colossus.
  - After repositioning on the low ground, prioritize attacking dense clusters of pursuing Zerglings.
3 .Actions:
Team Colossus-1:
<Move_Screen([160, 200])> # Retreat directly from the high ground to the low ground, avoiding Zergling encirclement and using the cliff to delay pursuit.
<Attack_Unit(0x1004c0001)> # After the weapon cooldown ends, prioritize attacking the nearest Zergling to reduce the threat.
<Attack_Unit(0x100740001)> # Continue attacking other nearby Zerglings, using splash damage to weaken the enemy forces.
"""
    # Part 2
    if self.config.ENABLE_COMMUNICATION:
      self.sp += \
        """
4.Communication Output
    If there is Available Communicate Target, you should keep communicating with them by Communication functions. For example, if 'Commander' and 'CombatGroup4' in Available Communicate Target, you can output as:
    Communications:
        <MessageTo(Commander, '''xxxxxxxxxx''')>
        <MessageTo(CombatGroup4, '''xxxxxxxxxx''')>
"""
      self.eip += \
        """
Communication:
    From Commander: Your task is to attack the enemy workers of an enemy base near minimap [48,32]. Intelligence shows that two enemy Queens are located on the minimap [44,32]. Try to avoid being detected by enemy Queens before arriving.
Available Communication Tragets:
    Commander: Protoss military supreme commander. Responsible for making macro decision through communication, and controls nexus for massrecall for tactical objectives.
Available Communication Functions:
    <MessageTo(AgentName, message)>
    <MessageTo(ChannelName, message)>
    <ListenTo(ChannelName)>
Args explanation:
    (1)AgentName: refers to a name mentioned in Available Communication Tragets.
    (2)ChannelName: shape as Channel-i, i refers to an integer.
    (2)message: any text wrapped between ''' and '''.
"""
      self.eop += \
        """
Communications:
    <MessageTo(Commander, '''Copy that, we have arrived enemy base, and started attack enemy workers''')>
"""

    # Part 3
    self.eip += \
      f"""
Give each team no more than {self.config.MAX_NUM_ACTIONS} actions.
Now, start generating your analysis and actions:
"""
class CommanderPrompt(BasePrompt):
  # TODO: Design a prompt specifically for the supreme military commander
  def __init__(self, name, log_id, config):
    super(CommanderPrompt, self).__init__()
    self.name = name
    self.config = config
    self.log_id = log_id
    # self.sp = ''
    # self.eip = ''
    # self.eop = ''


class DeveloperPrompt(BasePrompt):
  # TODO: Design a prompt specifically for the supreme logistics commander
  def __init__(self, name, log_id, config):
    super(DeveloperPrompt, self).__init__()
    self.name = name
    self.config = config
    self.log_id = log_id
    # self.sp = ''
    # self.eip = ''
    # self.eop = ''
PROTOSS_FACTORY = {
  'default': CombatGroupPrompt,
  'commander': CommanderPrompt,
  'developer': DeveloperPrompt,
}
TERRAN_FACTORY = {}
ZERG_FACTORY = {}
FACTORY = {
  'protoss': PROTOSS_FACTORY,
  'terran': TERRAN_FACTORY,
  'zerg': ZERG_FACTORY,
}
if __name__ == "__main__":
  from llm_pysc2.agents.configs.config import ProtossAgentConfig

  config = ProtossAgentConfig()
  prompt = CombatGroupPrompt('CombatGroup1', log_id=0, config=config)
  print("--" * 25 + "System Prompt" + "--" * 25)
  print(prompt.sp)
  print("--" * 25 + "Example Input Prompt" + "--" * 25)
  print(prompt.eip)
  print("--" * 25 + "Example Output Prompt" + "--" * 25)
  print(prompt.eop)
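The answer below has three parts:
(A) A brief summary of the main content of your original prompt
(B) Several brief modifications for the requirement "when the Colossus fights Zerglings, no scouting is needed; use only ranged attacks and avoid melee contact"
(C) Some feasible "split chains + dynamic dispatch" adjustments based on the multiple chains of thought (Multi-CoT) idea
Overall, your prompt (sp, eip, and eop in the CombatGroupPrompt class) is mainly divided into the following parts:
Basic information (map / roles / objectives)
Key Victory Points
Decision Process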
IS_COLOSSUS_HIGH_GROUND
IS_ZERGLING_HIGH_GROUND
IS_ZERGLING_NEAR
IS_AT_CLIFF
IS_COLOSSUS_WEAPON_COOLDOWN
IS_COLOSSUS_HEALTH_LOW
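Integrated Strategy and Decision Framework
Output format
Actions such as <Move_Screen([x,y])> or <Attack_Unit(0xTAG)>.
Overall, this prompt already provides fairly comprehensive guidance, condition checks, and example output for the "Colossus vs. Zerglings" scenario.
You mentioned: "When the Colossus attacks Zerglings there is no need to scout; ranged attacks are enough, and the Zerglings must not get close."
Within the existing prompt framework, you can make the following adjustments:
Remove or tone down "scouting" (Scout) descriptions
Emphasize "kiting + ranged damage"
In Key Victory Points or Integrated Strategy, add a hint such as "prioritize dealing damage from long range; never let Zerglings get within range 7".
Main logic after simplification
<Attack_Unit(xxx)>; <Move_Screen([...])> to open up distance.
Speed up the attack tempo
When IS_ZERGLING_NEAR = True, immediately fall back to a distance of "7 + SafeMargin", then counterattack once the weapon cooldown ends.
You also mentioned combining multiple chains of thought (Multiple Chains of Thought), so that the Colossus chooses a different reasoning process in different states.
Below is an illustrative approach:
Since you now only need "ranged attacks + no melee contact + no scouting", the whole tactical flow can be roughly divided into two or three chains of thought:
Offense Chain
Defense/Mobility Chain
(Optional) dedicated Kiting Chain
IF the enemy is too close OR the weapon is on cooldown OR our health is low → dispatch the Defense/Mobility Chain
ELSE → dispatch the Offense Chain
“We have two sub-chains of thought: OffenseChain and MobilityChain. The meta-level will choose which chain to use based on the conditions below: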
- If the colossus is within 6 range of any zergling, or the weapon is on cooldown → use MobilityChain.
- Otherwise → use OffenseChain.”
- (Meta-Level) "Conditions triggered: weapon cooldown = true, so we route to MobilityChain"
- (MobilityChain) "We decide to move to [160, 190]"
- Done.
<OffenseChain reasoning> ... </OffenseChain>
<MobilityChain reasoning> ... </MobilityChain>
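This better reflects the requirement of "sustained fire while keeping a safe distance" in your prompt structure, enabling flexible and controllable Colossus micro.
Please lay out your reasoning in full, with tight logic.
Below is a clearly organized, tightly reasoned summary of the improvement plan, centered on "no scouting, ranged attacks only, keep Zerglings out of melee range" and on "incorporating multiple chains of thought (Multi-CoT)".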
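The original prompt has three main parts:
sp: conditions (IS_COLOSSUS_HIGH_GROUND, IS_ZERG_NEAR, etc.) and decision rules (such as "movement takes priority while the weapon is on cooldown" and "retreat immediately when surrounded").
eip: <Move_Screen([x,y])> and <Attack_Unit(tag)>.
eop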
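The original prompt mentions scouting or detecting enemy distribution and paths in a few places. If scouting is not needed, it can be trimmed as follows:
Remove / tone down "scouting" passages
Emphasize ranged firepower and safe distance
Simplify the kiting mechanism
Split the Colossus's combat logic into two or three relatively independent chains of thought (Chain of Thought), for example:
Advantages:
The following is an illustrative split, for reference:
Offense Chain
Commands such as <Attack_Unit(0xTAG)>.
Mobility Chain
Commands such as <Move_Screen([x,y])>.
(Optional) Kiting Chain
Meta Reasoning: on every game loop / observation update, first make one routing check:
Merged output: in the final Analysis, the model can first state which chain it routed to and the decision that chain produced, then give the specific <Move_Screen(...)> or <Attack_Unit(...)>.
Add a passage explaining the multiple chains of thought / routing
In the Decision Process section, or at the start of "Analysis", let the model know: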
“We have two sub-chains: OffenseChain and MobilityChain. The meta-level decides which chain to invoke based on the following conditions…”
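Remove or simplify "scouting" passages
Restate the "keep Zerglings out of melee range" policy
- As soon as an enemy unit is at distance <= 7 or shows signs of encircling, retreat immediately.
- Prioritize attacking only at a safe distance (e.g. > 7) and when the weapon cooldown has ended.
With these changes, the whole prompt system better fits the core needs of "Colossus vs. Zerglings": maximize ranged attacks, drop scouting, and strictly prevent melee contact. Under multi-chain management the AI can also switch adaptively between offense and defense, letting the Colossus exploit terrain and range to achieve its tactical goals.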
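How exactly can it be modified? Please give the outline in Chinese first.
Below is an outline of how to modify your prompt concretely to achieve "no scouting, ranged attacks only, keep Zerglings out of melee range" combined with "multiple chains of thought". For clarity, the outline has four steps: removing scouting, strengthening ranged attacks and kiting, introducing a multi-chain prompt, and examples of what to add or change in the prompt.
Locate scouting descriptions
In sp / eip / eop, if there are passages or comments about "scouting enemy positions" or "actively exploring the map", delete them or comment them out.
Keep reading the enemy's "distance and health"
In "Key Victory Points"
In "Action Decision Rules"
"If IS_ZERGLING_NEAR = True (e.g. distance less than 7), then regardless of weapon cooldown, prioritize moving to open up distance and avoid melee contact."
Addition: kiting logic
"After the Colossus finishes an attack, if enemies are closing in fast, retreat immediately so that the Zerglings take more ranged hits while chasing; when the cooldown ends, stop moving and attack again. Repeat until the number of Zerglings is greatly reduced."
In the System Prompt or Decision Process, you can insert a passage like this:
Multi-chain-of-thought description
We split the Colossus's decision-making into two chains of thought:
- OffenseChain: handles when to attack and which Zergling to attack.
- MobilityChain: handles when to retreat, how to choose movement coordinates, and how to use the high ground and cliffs.
At each decision step, the model must first make a "Meta Decision":
- If a Zergling is too close (≤7) or the weapon is on cooldown, invoke MobilityChain;
- If the weapon is ready and the distance is safe (>7), invoke OffenseChain;
- In other special situations (critically low health, etc.), moving away may take priority.
In the final output, first state which chain of thought was dispatched, then give that chain's analysis and actions.
If you want the final output to be more "multi-chain readable", have the model write in Analysis:
(1) [Meta Decision] Enemies are approaching and the weapon is on cooldown → invoke MobilityChain
(2) [MobilityChain] Determine a safe point from screen coordinates → <Move_Screen([...])>
(3) [End]
Alternatively, put this description only in the internal prompt and have the model mention in the normal "Analysis": "Therefore this turn uses the MobilityChain to retreat."
Below is a more concrete example showing how to directly modify certain passages of the System Prompt (sp). Only the key fragments are shown, for you to integrate into your existing framework as needed: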
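[Example: fragments to add or modify in sp]
...
3. Key Victory Points:
   - Add: The Colossus should keep Zerglings from getting within distance 7 as far as possible. As soon as Zerglings reach this safety distance, it must retreat immediately to avoid being swarmed.
   - Remove: any description of "scouting or exploring the map".
   - Keep: using the cliff to move quickly between the high and low ground and delay the enemy.
4. Decision Process
   4.1 Condition Assessment Rules:
       - Keep checks such as IS_COLOSSUS_HIGH_GROUND / IS_COLOSSUS_WEAPON_COOLDOWN / IS_ZERGLING_NEAR.
       - Remove or tone down "scouting"-related checks.
   4.2 Multi-CoT mechanism: (new description)
       - We split decision-making into OffenseChain and MobilityChain.
       - When the weapon is ready and the enemy distance > 7: prefer OffenseChain to attack.
       - When the enemy distance ≤ 7 or the weapon is on cooldown: use MobilityChain to retreat and open up distance.
       - If health is too low (IS_COLOSSUS_HEALTH_LOW = True), always prefer MobilityChain.
       - When retreating, move directly to a safe point on the low ground (or high ground) to use the cliff to delay the enemy.
...
6. Analysis:
   - Output format: first state "which chain of thought is chosen this turn", then give the corresponding analysis and actions.
   - For example:
     [OffenseChain] Goal: launch a ranged attack on the nearest Zergling.
     [MobilityChain] Goal: move to the safe coordinate [x, y] and keep distance from the enemy.
7. Action Output:
   - Actions for Team Colossus-1:
     - No more than 2 commands, usually Move or Attack.
     - Reminder: if it attacked in the previous second and the weapon is still on cooldown this second, move first.
...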
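Notes:
With the changes above, your prompt can achieve:
Below is a step-by-step approach to help you distinguish, in the "Colossus vs. Zergling" scenario, when to enter the Offense Chain versus the Defense / Mobility Chain, and how each one should reason.
You can first design a "Meta Decision" that, on every observation of the environment, decides which chain to switch to based on the current situation. Here are several key conditions that the meta decision can check one by one:
Enemy distance
Weapon cooldown status
Health status
Enemy numbers or encirclement
Summary: as long as any "defense trigger" (enemy close, weapon on cooldown, health critical, surrounded) is true, enter the defense chain; otherwise enter the offense chain. You can write these conditions explicitly in the prompt or script, so that the model or program checks them first and then invokes the corresponding chain of thought.
Main goal: maximize damage output and kill enemies quickly, striking from range while making sure the Colossus is not engaged in melee.
Check weapon cooldown
Select the attack target
Assess the enemy's reaction
Repeat or finish
Summary: the core of the offense chain is to deal as much damage as possible and, after every attack action, to check "are enemies closing in" in order to decide whether to keep attacking or switch to defense.
Main goal: use positioning or high/low terrain to avoid melee contact or encirclement, buying safe time and space for the next ranged volley.
Confirm why defense was triggered
Choose the movement direction / position
Judge whether the cliff is usable
Keep monitoring enemy distance
Summary: the defense chain is essentially positioning and stalling logic: avoid melee contact or encirclement, then switch to the offense chain once the weapon cooldown ends.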
def meta_decision(env_state):
    # env_state holds the key information: Colossus position, enemy distance,
    # weapon cooldown, health (as a fraction of full health), etc.
    if (env_state.weapon_cooldown > 0) or (env_state.zerg_distance <= 7) or (env_state.colossus_health < 0.6):
        return "DefenseChain"
    else:
        return "OffenseChain"
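Offense Chain: <Attack_Unit(tag)>;
Defense Chain: <Move_Screen([x, y])>;
In the concrete prompt or script, the logic above can be written as conditionals so that the LLM first runs meta_decision and then generates the reasoning and actions of the corresponding chain. You can also tell the model explicitly in the prompt: "if these conditions hold, take the defense chain; otherwise take the offense chain."
With this approach you can clearly separate the offense chain from the defense chain and let the AI switch flexibly during a StarCraft II game, sustaining firepower while doing its utmost to avoid being swarmed by large numbers of melee units.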
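The routing above presupposes an env_state object, which the conversation never defines. The following is a minimal sketch of one way to fill it from the textual observation, assuming the line format of the example input prompt ("Weapon Cooldown Time: 0.91s", "Distance: 4", "Health: 350 (100%)"). `EnvState` and `parse_env_state` are hypothetical names introduced here for illustration and are not part of LLM-PySC2; the thresholds 7 and 0.6 are the ones used in the discussion.

```python
import re
from dataclasses import dataclass


@dataclass
class EnvState:
    """Hypothetical container for the routing inputs."""
    weapon_cooldown: float   # seconds until the Colossus can fire again
    zerg_distance: float     # distance to the nearest Zergling
    colossus_health: float   # health as a fraction in [0, 1]


def parse_env_state(obs_text: str) -> EnvState:
    """Extract the routing inputs from a textual observation."""
    cooldown = re.search(r"Weapon Cooldown Time: ([\d.]+)s", obs_text)
    health = re.search(r"Unit: Colossus.*?Health: \d+ \((\d+)%\)", obs_text, re.S)
    if cooldown is None or health is None:
        raise ValueError("observation does not contain the Colossus status line")
    # In the example observation only enemy units carry a "Distance:" field.
    distances = [float(d) for d in re.findall(r"Distance: ([\d.]+)", obs_text)]
    return EnvState(
        weapon_cooldown=float(cooldown.group(1)),
        zerg_distance=min(distances) if distances else float("inf"),
        colossus_health=int(health.group(1)) / 100,
    )


def meta_decision(env_state: EnvState) -> str:
    """Route to the defense chain if any defense trigger holds."""
    if (env_state.weapon_cooldown > 0
            or env_state.zerg_distance <= 7
            or env_state.colossus_health < 0.6):
        return "DefenseChain"
    return "OffenseChain"
```

Doing the parsing in code keeps the numeric comparison out of the LLM's hands; the image-based conditions (high ground, cliff) are not covered by this sketch.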
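Is splitting into two chains appropriate?
Whether splitting into just two chains (offense and defense) is appropriate depends on your needs and on how finely you want to refine the strategy. In most cases two chains already handle the two core questions of "when to attack and when to retreat"; if you want a finer, more flexible strategy, you can split further. The points below analyze this:
Simple to implement and easy to maintain
Covers the main needs
Lower reasoning overhead
Extensibility to complex situations
Lacks fine-grained control over details such as "kiting" or "mobile harassment"
Hard to handle multi-dimensional decisions
"Kiting Chain"
"Multi-unit coordination chain"
"Macro tactics chain"
Start with two chains
Extend as needed
Keep the layers clearly separated
Give the concrete reasoning flow for the two chains, to guide the LLM's thinking.
Below is a more actionable example to help you, in the prompt or in the implementation, guide the LLM to "think through" (reasoning process):
and to explain how to have the model reach a decision step by step within each chain.
Goal: under safe conditions, deal maximum damage to the enemy and quickly whittle down its forces.
Below is a structural example of a pseudo "chain of thought" (Chain of Thought). You can present it in your prompt as comments or meta-instructions so that the model knows how to think and in what order. You can also "hide" it in an implicit part such as the System Prompt so that the model follows it automatically when generating.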
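[Offense Chain Reasoning]
1. Check the attack conditions:
   - (a) Is the weapon ready (IS_COLOSSUS_WEAPON_COOLDOWN = False)?
   - (b) Are enemies outside point-blank range (enemy distance > 6 or > 7)?
2. If the attack conditions are not met:
   - If the weapon is on cooldown or enemies have closed in, end this chain and switch to the defense chain;
   - (or notify the "meta decision" function to switch to the defense chain automatically).
3. If the attack conditions hold: choose a concrete attack strategy
   3.1 Find the priority target:
       - By distance: attack the nearest enemy first, so that it cannot close in before being dealt with.
       - Or by health: if a low-health Zergling can be killed in one shot, clear it first.
       - Or by density: if a group of Zerglings is bunched together, the Colossus's splash attack can kill several with one shot.
   3.2 Output the command:
       - <Attack_Unit(enemy tag)>
       - For example: <Attack_Unit(0x1004c0001)>
4. Assess the situation after the attack:
   - If enemies start rushing toward us and the distance drops below 7, the next frame / next turn may need to switch to the defense chain.
   - If enemies remain at a safe distance and the weapon cooldown is short, this chain can continue with its next attack.
5. End / return:
   - If the attack is complete or the safety conditions no longer hold, switch to the defense chain (Mobility Chain) at any time.
   - If the enemy has been significantly reduced and is no longer a threat, stay in the offense chain and keep dealing damage.
[End Offense Chain Reasoning]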
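- Key point: in the prompt or script, you can have the model check the steps above one by one when it enters OffenseChain, then finally output an <Attack_Unit(...)> action and explain in Analysis why it attacks that target.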
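The three target criteria in the offense chain (distance, health, density) can also be computed outside the LLM and passed in as a suggestion. The sketch below is one possible ordering, not a rule stated in the conversation: a Zergling inside a hypothetical `danger_dist` is taken first, otherwise the unit with the most neighbours inside a hypothetical `cluster_radius` (in screen pixels) wins, with ties going to the lower-health unit. `Enemy` and `choose_target` are illustrative names.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Enemy:
    tag: str
    screen_pos: Tuple[int, int]  # (x, y) screen coordinates
    distance: float              # distance to the Colossus, as reported in the observation
    health_pct: int              # remaining health in percent


def count_neighbours(enemy: Enemy, enemies: List[Enemy], radius: float) -> int:
    """Number of other enemies within `radius` screen pixels of `enemy`."""
    return sum(
        1 for other in enemies
        if other is not enemy and math.dist(other.screen_pos, enemy.screen_pos) <= radius
    )


def choose_target(enemies: List[Enemy],
                  danger_dist: float = 3,
                  cluster_radius: float = 15) -> Optional[str]:
    """Return the tag to attack, or None when there is no enemy."""
    if not enemies:
        return None
    nearest = min(enemies, key=lambda e: e.distance)
    if nearest.distance <= danger_dist:
        return nearest.tag  # immediate threat comes first
    # Otherwise aim at the densest cluster to exploit splash damage;
    # ties are broken in favour of the unit with less health.
    best = max(
        enemies,
        key=lambda e: (count_neighbours(e, enemies, cluster_radius), -e.health_pct),
    )
    return best.tag
```

The default values are placeholders and would need tuning against the actual screen scale.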
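Goal: avoid melee contact or encirclement, open up a safe distance, and maneuver using terrain or elevation differences.
Similarly, you can design a "reasoning flow":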
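[Defense Chain Reasoning]
1. Defense triggers:
   - (a) The weapon is on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True).
   - (b) Enemy distance ≤ 7, with a risk of melee contact or being swarmed.
   - (c) Our health is low (< 60%) and there are at least X enemies.
2. Choose a movement strategy:
   - (a) Check the terrain: is there a usable cliff (IS_AT_CLIFF = True, or one close by)?
   - (b) Can the Colossus step directly down to the low ground / up to the high ground, forcing enemies to take the long way round and buying time?
   - (c) If it is in the top-right corner of the map, pushed against the red boundary, does it need to move along the X or Y axis to a safer position?
3. Compute the movement coordinate / path:
   - (a) From the Colossus's current coordinate [cx, cy] and the enemy group's coordinate [ex, ey], move in the opposite direction to open up distance.
   - (b) If a cliff is available, set the target point [cx + dx, cy + dy] on the other terrain level (low ground / high ground).
   - (c) If the cliff cannot be crossed, move toward the other end of the map and keep a distance > 7.
4. Output the command:
   - <Move_Screen([xx, yy])>
   - For example: <Move_Screen([160, 200])>
5. Follow-up checks:
   - While moving, if the weapon cooldown has ended and enemies are not in melee range, try switching back to the offense chain (OffenseChain).
   - If still being chased closely, invoke the defense chain repeatedly and keep moving until out of danger.
[End Defense Chain Reasoning]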
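- Key point: in the defense chain, what matters most is the logic for choosing the movement direction and a safe coordinate, and when to switch back to the offense chain.
- You can guide the LLM to write in "Analysis": "I move to [160,200] because it evades the pursuit and makes use of the cliff."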
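Step 3(a) of the defense chain (move directly away from the enemy group) is simple vector arithmetic. The sketch below assumes screen coordinates in the 0 to 256 range given in the action-argument description; `retreat_point`, `step`, and `margin` are hypothetical names and values. It deliberately ignores terrain, so cliffs and the red boundary still have to be judged from the image.

```python
import math
from typing import List, Sequence, Tuple

SCREEN_MAX = 256  # screen coordinates range from 0 to 256 in the prompt


def retreat_point(colossus: Sequence[float],
                  enemies: List[Tuple[float, float]],
                  step: float = 40,
                  margin: float = 5) -> List[int]:
    """Return a screen coordinate `step` pixels away from the enemy centroid."""
    if not enemies:
        return [int(colossus[0]), int(colossus[1])]
    ex = sum(x for x, _ in enemies) / len(enemies)
    ey = sum(y for _, y in enemies) / len(enemies)
    dx, dy = colossus[0] - ex, colossus[1] - ey
    norm = math.hypot(dx, dy)
    if norm == 0:
        # Enemy centroid sits exactly on the Colossus: default to moving right.
        dx, dy, norm = 1.0, 0.0, 1.0
    tx = colossus[0] + step * dx / norm
    ty = colossus[1] + step * dy / norm

    def clamp(value: float) -> int:
        # Keep the target inside the screen, `margin` pixels from the edge.
        return int(round(min(max(value, margin), SCREEN_MAX - margin)))

    return [clamp(tx), clamp(ty)]
```

The result can be offered to the model as a candidate movement point rather than issued blindly.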
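Guide explicitly in the prompt
"We have two sub-chains: OffenseChain and DefenseChain. First decide which chain to take based on the weapon cooldown and the enemy distance. Then, within that chain, reason according to the following flow and finally give actions such as <Attack_Unit> or <Move_Screen>."
Show it in the output format
In the Analysis section, first declare "this turn I chose OffenseChain because XX; then, following the OffenseChain reasoning, I will attack the nearest Zergling first...", and finally output <Attack_Unit(...)>.
In Analysis write "I chose DefenseChain because the enemy is too close / the weapon is still on cooldown, so I first retreat to [160,200]..." and output <Move_Screen([160,200])>.
"Meta decision" or "routing check"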
if weapon_cooldown > 0 or zerg_dist <= 7:
    chain = "DefenseChain"
else:
    chain = "OffenseChain"
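Suppose that at some frame you observe:
Then the meta decision concludes: since zerg_dist <= 7 or weapon_cooldown > 0, enter the defense chain (DefenseChain). The model might output a result structured as follows (pseudo example):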
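Analysis:
(1) Meta Decision:
    - weapon_cooldown=0.5, zerg_dist=5
    - Defense conditions met: enemy close + weapon not ready → enter DefenseChain
(2) [Defense Chain Reasoning]:
    2.1 Choose the movement strategy: the nearest safe coordinate is [160, 200] (toward the low ground); the cliff can be used to delay the enemy
    2.2 Issue the move command
Actions:
  Team Colossus-1:
    <Move_Screen([160, 200])>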
Conversely, if you find:
then it enters the offense chain (OffenseChain) and outputs something like:
Analysis:
(1) Meta Decision:
    - weapon_cooldown=0, zerg_dist=10
    - Attack conditions met → enter OffenseChain
(2) [Offense Chain Reasoning]:
    2.1 Find the priority target: the nearest or most densely clustered Zergling, Tag=0x1234
    2.2 Launch the attack
Actions:
  Team Colossus-1:
    <Attack_Unit(0x1234)>
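Explain in Analysis + Actions why the action was taken.
With such an explicit reasoning flow, the LLM can analyze both "attack" and "defense" in an orderly way and output the corresponding commands, achieving fairly efficient StarCraft II micro logic.
Only one of the offense chain and the defense chain can be chosen, right? Doesn't that mean we have to decide between offense and defense before settling on a chain of thought? And what output do you expect from the offense chain and the defense chain?
Below is a more direct explanation answering your three core questions: "can only one chain be chosen", "how to finally decide between attack and defense", and "what output is expected from each chain".
Within a single decision (a single action) at a given moment, usually only one strategy can be executed: either attack, or move defensively. It is like a human playing the game: at any moment you either click to attack an enemy or click to walk back, and you cannot run and shoot at the same time (unless the game has a special move-and-shoot mechanic, but a Colossus in StarCraft II normally "either attacks or is given a move command").
Therefore, in one "reasoning pass" or one "command output", usually only one of the offense chain and the defense chain can be chosen. This requires a "Meta Decision" or "routing" step at every moment to judge whether "offense" or "defense" is more appropriate right now.
For example, first check:
This check can be done in the prompt or in code with a simple if-else or a logical combination.
Once one chain has been chosen, only the actions output by that chain are executed.
You may worry: "Do I have to think through both the offense chain and the defense chain first and then pick one?" In practice you can filter with a piece of "meta decision logic" first:
Check the key conditions (distance, weapon cooldown, health, whether surrounded, etc.) to form a boolean or multi-condition test:
After the meta decision, you know "this step only needs OffenseChain" or "this step only needs DefenseChain".
Then follow the internal reasoning flow of that chain to produce the final action.
Tips:
- You can write this meta decision into the system prompt so that the LLM "checks the conditions first, then branches to the corresponding chain".
- Or you can write the Python logic (if/else) in the program and, after the check, tell the LLM: "Please handle this turn with OffenseChain". Then the LLM does not have to reason about both chains at once.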
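The second tip (decide in the program, then tell the model which chain to use) could look like the sketch below. `route_chain`, `CHAIN_HINTS`, and `add_chain_hint` are hypothetical helpers written for this illustration; how the resulting text is passed to the agent depends on your own integration and is not shown.

```python
def route_chain(weapon_cooldown: float, zerg_dist: float) -> str:
    """Program-side routing, using the thresholds from the discussion."""
    if weapon_cooldown > 0 or zerg_dist <= 7:
        return "DefenseChain"
    return "OffenseChain"


# One short instruction per chain; the wording here is illustrative only.
CHAIN_HINTS = {
    "OffenseChain": (
        "Use OffenseChain this turn: pick one Zergling tag and output "
        "<Attack_Unit(tag)>, explaining the choice in Analysis."
    ),
    "DefenseChain": (
        "Use DefenseChain this turn: pick a safe screen coordinate and output "
        "<Move_Screen([x, y])>, explaining the choice in Analysis."
    ),
}


def add_chain_hint(observation_text: str, weapon_cooldown: float, zerg_dist: float) -> str:
    """Append the routing result to the observation text sent to the LLM."""
    chain = route_chain(weapon_cooldown, zerg_dist)
    return f"{observation_text}\n\nRouting (decided by the program): {CHAIN_HINTS[chain]}"
```

With this split the model only ever sees one chain's instructions per turn, which is the point of the tip.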
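Main actions:
<Attack_Unit(tag)>: attack a specific enemy unit (identified by its tag).
<Attack_Ground([x,y])> to attack a location or area, although most ground units in StarCraft II attack units.
Decision logic:
explained in Analysis).
Possible follow-ups:
Main actions:
<Move_Screen([x,y])>: move to the screen coordinate [x,y] to open up distance or to go up or down the cliff.
<UseAbility_Xxx(...)>, but plain movement is the most common.
Decision logic:
Possible follow-ups:
Can only one chain be chosen?
When is the chain decided?
What does each chain output?
<Attack_Unit(tag)>-type actions, to hit the nearest or best target.
<Move_Screen([x,y])>-type actions, using movement to prevent melee contact or encirclement.
This forms a fairly clear "offense-defense" loop:
The concrete decision logic of the two chains can be placed in the prompt (Chain of Thought) so that the LLM outputs the corresponding commands under different conditions; alternatively, the program can decide first and then tell the LLM that "only a certain chain is needed now".
The offense chain and the defense chain need steps that guide the reasoning, right? And how to decide which chain to take has to be explained clearly.
Below is a more complete reasoning-guidance scheme, including:
and a demonstration of how to perform the "meta decision" and the "chain reasoning" in the prompt.
Before the LLM starts its concrete reasoning (producing actions), you need a "meta decision" step to judge "should I attack or defend now?" This can be implemented in the prompt (System Prompt) or in your code logic. The usual approach is to define a few key conditions:
Weapon Cooldown
If the weapon is on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True), the Colossus cannot attack immediately, so lean toward defense and open up distance first.
Zergling Distance
Colossus Health
Risk of encirclement
Brief meta decision logic (example):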
def meta_decision(env_state):
    if (env_state.weapon_cooldown > 0
            or env_state.zerg_dist <= 7
            or env_state.colossus_health < 0.6
            or env_state.is_being_surrounded):
        return "DefenseChain"
    else:
        return "OffenseChain"
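In the prompt, you can also state a similar rule in natural language:
"If enemies are near, or the weapon is on cooldown, or health is low, choose the defense chain; otherwise choose the offense chain."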
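The is_being_surrounded flag used in the meta decision is not defined anywhere in the conversation. One rough way to estimate it from screen positions, offered purely as an assumption, is to count how many angular sectors around the Colossus contain a nearby Zergling. `near_px`, `sectors`, and `min_sectors` below are hypothetical tuning values.

```python
import math
from typing import Iterable, Sequence, Tuple


def is_being_surrounded(colossus: Sequence[float],
                        enemies: Iterable[Tuple[float, float]],
                        near_px: float = 60,
                        sectors: int = 8,
                        min_sectors: int = 5) -> bool:
    """Heuristic: True when nearby enemies occupy at least `min_sectors` directions."""
    sector_width = 2 * math.pi / sectors
    occupied = set()
    for ex, ey in enemies:
        dx, dy = ex - colossus[0], ey - colossus[1]
        if (dx == 0 and dy == 0) or math.hypot(dx, dy) > near_px:
            continue  # ignore overlapping or distant units
        angle = math.atan2(dy, dx) % (2 * math.pi)
        # The extra modulo guards against an angle rounding up to a full turn.
        occupied.add(int(angle / sector_width) % sectors)
    return len(occupied) >= min_sectors
```

A cluster approaching from one side occupies only a few sectors and returns False, while a ring of units returns True; the image-based judgement in the prompt remains the primary signal.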
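Below is a structural example that can be embedded in the prompt to guide the LLM's reasoning inside the offense chain. You can write something like "OffenseChain Reasoning Template" in the System Prompt for the model to invoke.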
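[Offense Chain Reasoning Steps]
1. Check whether attacking is feasible
   - Check whether the weapon is ready (IS_COLOSSUS_WEAPON_COOLDOWN = False)
   - Check whether the enemy distance is safe (enemy distance > 7)
   - If these conditions are not met, damage cannot be dealt safely → go to DefenseChain
2. Select the attack target
   - From all attackable enemies, pick the priority target:
     a) the nearest Zergling (to prevent melee contact)
     b) the Zergling with the lowest health that can be killed in one hit (fast attrition)
     c) or the densest enemy group (Colossus splash damage)
   - Explain in the reasoning: **why** this target?
3. Output the attack command
   - Format: `<Attack_Unit(tag)>`
   - State in Analysis: "I decided to attack Tag=XXX because it is the nearest / has the lowest health / is the most dangerous."
4. Follow-up checks
   - If enemies close in after the attack, switch to defense at the **next moment** or **next frame**;
   - If still safe, keep attacking or wait for the cooldown.
[End OffenseChain Reasoning]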
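- Expected output: usually one or two <Attack_Unit(0x...)> commands (because in StarCraft II an attack targets one specified unit at a time, not many targets at once).
- In the "Analysis" of the final answer, the LLM should state clearly how it reached the attack decision from these reasoning steps.
Likewise, you can add a "DefenseChain Reasoning Template" passage to the prompt to guide the LLM's movement / retreat logic.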
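[Defense Chain Reasoning Steps]
1. Confirm the defense trigger
   - (a) The weapon is on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True)
   - (b) Enemy distance ≤ 7 (risk of melee contact)
   - (c) Low health / being surrounded / other dangerous situations
2. Choose the movement direction / coordinate
   - If a cliff is nearby, use the elevation difference to stall the enemy;
   - If the cliff is far away, move toward an open area of the map or a safer corner;
   - From the Colossus's current coordinate [cx, cy] and the enemy coordinate [ex, ey], move in the opposite direction to open up distance;
   - Explain in the reasoning: **why** go to this coordinate?
3. Output the move command
   - Format: `<Move_Screen([x, y])>`
   - State in Analysis: "I chose to move to [x, y] to keep a safe distance and wait for the weapon cooldown."
4. Follow-up checks
   - If during the move the weapon cooldown ends and the enemy distance grows again, switch back to OffenseChain at the next moment / next frame;
   - If the danger has not passed, keep moving defensively.
[End DefenseChain Reasoning]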
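- Expected output: generally one <Move_Screen([xx, yy])> command, used to get away from the enemy or to move to a safer position or one with a terrain advantage.
- In "Analysis", the LLM should state "why I move to this coordinate, and how this avoids the enemy".
You can first write a passage of "meta decision rules" in the prompt, then paste the two "Chain Reasoning Steps", and finally require the LLM to answer in this order: run the meta decision → decide which chain to take → enter that chain of thought → output the commands.
Illustrative prompt structure (for reference only, in pseudocode form):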
System Prompt:
1. Meta decision:
   - If (weapon_cooldown > 0) or (zerg_dist <= 7) or (colossus_health < 0.6) → DefenseChain
   - Otherwise → OffenseChain
2. OffenseChain Reasoning Steps:
   (steps 1 to 4 as described above)
3. DefenseChain Reasoning Steps:
   (steps 1 to 4 as described above)
4. Output format:
   - In the [Analysis] section, first write "why I chose Offense/DefenseChain"
   - Then give the action commands: <Move_Screen(...)> or <Attack_Unit(...)>
   - No more than 2 commands

Example Input:
(enemy coordinates, Colossus attributes, health, weapon cooldown, etc.)

Example Output:
(an example: Analysis + Actions)

Please begin your reasoning:
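This way, the LLM follows the template of running the meta decision first and then entering one chain, and finally outputs the corresponding action commands.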
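If the prompt is assembled in code, the four-part structure above can be built from separate pieces so that each chain's steps can be edited independently. The sketch below is one way to do this under that assumption; `build_system_prompt` is a hypothetical helper, and its output would still have to be assigned to your own prompt object.

```python
from typing import List


def build_system_prompt(meta_rules: List[str],
                        offense_steps: List[str],
                        defense_steps: List[str],
                        max_actions: int = 2) -> str:
    """Join the meta decision, both chain templates, and the output format."""
    sections = [
        ("1. Meta decision", meta_rules),
        ("2. OffenseChain Reasoning Steps", offense_steps),
        ("3. DefenseChain Reasoning Steps", defense_steps),
        ("4. Output format", [
            "In the Analysis section, first state which chain was chosen and why",
            "Then give the action commands: <Move_Screen(...)> or <Attack_Unit(...)>",
            f"Give no more than {max_actions} commands",
        ]),
    ]
    lines = []
    for title, items in sections:
        lines.append(f"{title}:")
        lines.extend(f"   - {item}" for item in items)
    return "\n".join(lines)
```

Keeping the sections as lists makes it easy to log which version of each template was used in a given run.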
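<Attack_Unit(0xXXXX)>;
<Move_Screen([x, y])>;
With this structured guidance of "meta decision + 2 chains of thought + command output", the LLM can make "attack or defend" decisions in StarCraft II micro scenarios in an orderly way and explain the reasoning behind them.
Because the output supports executing multiple actions: for the offense chain, attack first and then move by default; for the defense chain, first move to retreat to a safe area and then attack by default.
Below is an example approach that supports outputting multiple actions in one decision, with a different execution order depending on whether the "offense chain" or the "defense chain" is active:
This lets the LLM give multi-step actions in one decision (for example first <Attack_Unit(...)> and then <Move_Screen(...)>), meeting more flexible combat needs.
On this basis we still need a "meta decision" to judge whether the current priority is offense or defense. Typical logic:
Once the chain is determined, decide within that chain the concrete multi-step sequence of "what to do first and what to do next".
Idea: when it is relatively safe (enemies are far away or will not reach melee range for now), one decision can "attack first, then decide whether to move based on the result".
Below is an example that can be embedded in the prompt or in a reasoning template: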
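[Offense Chain Reasoning]
1. Preliminary check of the attack conditions
   - (a) IS_COLOSSUS_WEAPON_COOLDOWN = False (weapon ready)
   - (b) Nearest enemy distance > 7 (will not be in melee range instantly)
2. Attack first
   2.1 Select the main attack target
       - The nearest or weakest enemy; or use splash to hit the densest point.
   2.2 Output the command: <Attack_Unit(tag)>
3. Supplementary action after the attack
   3.1 Check enemy movement: are they closing in fast?
       - If enemies start charging (the distance may drop below 7), add a move command immediately to avoid melee contact.
       - If still within a safe distance, moving is optional.
   3.2 If a move is needed, output: <Move_Screen([x, y])>
       - For example, step back slightly to avoid being reached in the next second.
4. Final output
   - Output at most two commands: first Attack_Unit(...), then Move_Screen([x,y]) if necessary.
   - Explain in "Analysis" why to attack first and then move, e.g. "the weapon is ready and the enemy distance is still safe, so fire once first; then, because enemies may close in, add a step back."
[End OffenseChain Reasoning]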
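- Example: the final action combination might be `<Attack_Unit(0x1004c0001)>` followed by `<Move_Screen([160, 190])>`.

This implements the multi-step strategy of "hit the enemy once, then step back slightly".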
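Idea: when the weapon is on cooldown or the enemy is at close range, survival and distance come first, but once it is a little safer the Colossus may still strike the pursuing enemy.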
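```plaintext
[Defense Chain Reasoning]
1. Preliminary check of the defense conditions
   - (a) Weapon is on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True), or
   - (b) Enemy distance ≤ 7, or
   - (c) Colossus health is low, or another high-risk situation
2. Move / retreat first
   2.1 Decide the move target
       - The nearest safe coordinate [x,y], or use the terrain below a cliff to delay the enemy.
   2.2 Output the command: <Move_Screen([x, y])>
3. Add an attack if the situation allows
   3.1 Check: after moving, has the distance opened up to a safe value > 7?
   3.2 If the weapon is also ready by then and the enemy is chasing but still a few cells away → take the opportunity for one shot (and do not overcommit).
   3.3 If attacking, output: <Attack_Unit(tag)>
4. Final output
   - At most two commands: first Move_Screen(...), then Attack_Unit(...) if it is safe and the weapon is ready.
   - Explain in the "Analysis" why an attack is still possible after moving, e.g. "Moving to [160, 190] opens a little distance from the enemy; the weapon cooldown has ended and the enemy is not yet at close range, so I add one attack."
[End DefenseChain Reasoning]
```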
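- Example: the output actions are `<Move_Screen([160, 200])>` followed by `<Attack_Unit(0x100700001)>`, meaning "fall back to [160, 200] first, then fire a shot in passing". This matches the defense chain's idea of securing safety first and counterattacking when the chance arises.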
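- Meta decision: choose the offense chain or the defense chain.
- In-chain flow:
  - Offense chain: `<Attack_Unit>` first, optionally followed by `<Move_Screen>`;
  - Defense chain: `<Move_Screen>` first, optionally followed by `<Attack_Unit>`.
- Model output: in the `Analysis`, you can guide the model to write: "I am currently following the offense chain, because the weapon is ready and the enemy distance is > 7. So I first output an attack command and fire at the nearest Zergling, then move to [160,190] to avoid being reached in the next second."

This accommodates the need for multiple actions per decision in StarCraft II and other RTS games, while keeping the strategy logic clear: "offense chain attacks then moves / defense chain moves then attacks".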
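The "attack then move / move then attack" ordering can also be enforced in code after the model replies, rather than relying on the prompt alone. The following is a small sketch under assumptions: the chain names `"offense"` and `"defense"`, the `(name, argument)` tuple representation, and the function `order_actions` are hypothetical and not taken from LLM-PySC2.

```python
# Allowed action order for each chain (hypothetical chain names).
CHAIN_ORDER = {
    "offense": ["Attack_Unit", "Move_Screen"],
    "defense": ["Move_Screen", "Attack_Unit"],
}


def order_actions(chain, actions):
    """Reorder model actions to match the chain's policy.

    `actions` is a list of (name, argument) tuples. At most one action of
    each type is kept, so the result never exceeds two commands. Actions
    with unknown names are dropped.
    """
    ordered = []
    for name in CHAIN_ORDER[chain]:
        for action in actions:
            if action[0] == name:
                ordered.append(action)
                break  # keep only the first action of this type
    return ordered
```

If the model answers a defense-chain prompt with the attack listed first, this check silently restores the move-first order, which keeps the executed behavior consistent with the prompt's stated policy.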
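The execution result cannot be checked: once the chain of thought has issued a move command and it has been executed, the next round has already begun. Please refine the answer above accordingly.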
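Below is a more complete approach that explains how, when "multiple actions are output in one decision", to handle the practical limitation that execution results cannot be checked within the current chain of thought. In other words, after issuing "attack then move" or "move then attack", we must wait until the next frame (or the next decision cycle) to see the effect and then decide on a new action.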
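- An in-game action (`<Move_Screen>` or `<Attack_Unit>`) usually takes real time to complete.
- Within one decision, the model can therefore only plan "first `<Attack_Unit(...)>`, then `<Move_Screen(...)>`" and output the two actions together.

Combining the earlier "attack then move" / "move then attack" idea with the fact that execution results cannot be checked on the spot, the prompt or logic can be written as follows: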
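- Meta decision: judge whether the current situation favors offense or defense.
- In the current chain: draw up a "package" of two actions.
  - `<Attack_Unit(tag)>` then `<Move_Screen([x,y])>`, if you want to fire once and then fall back.
  - `<Move_Screen([x,y])>` then `<Attack_Unit(tag)>`, if you want to seize the chance to counterattack after retreating to a safe distance.
- Emphasize: the results of these two actions can only be known at the next frame (the next decision).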
The following is a more matter-of-fact example of the prompt structure:
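```plaintext
System Prompt (simplified example):
1. Meta decision:
   - If (IS_COLOSSUS_WEAPON_COOLDOWN = True) or (enemy distance ≤ 7) or (health < 60%) → defense chain (move first)
   - Otherwise → offense chain (attack first)

2. Offense chain (OffenseChain):
   - Reasoning flow:
     (1) If the weapon is ready and the enemy is not close → first <Attack_Unit(...)>
     (2) Optional: to avoid being reached in the next second → follow with <Move_Screen(...)>
     (3) Whether the attack/move succeeded cannot be confirmed this round; observe the actual result next round

3. Defense chain (DefenseChain):
   - Reasoning flow:
     (1) First <Move_Screen([x,y])> away from the enemy or toward a cliff / safe area
     (2) Optional: if the weapon is ready and it is temporarily safe, also issue <Attack_Unit(...)>
     (3) Likewise, the final effect of these actions cannot be checked this round; observe again next round

4. A single output may contain at most 2 commands, in this order:
   - Offense chain: [Attack_Unit, Move_Screen]
   - Defense chain: [Move_Screen, Attack_Unit]
   (omit the second step if it is not appropriate)

5. Notes:
   - The outcome of "execution finished" is only known in the next round
   - So the Analysis only needs to explain why these two actions are taken; the next action is decided at the next frame from the new observation
```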
In the "Analysis", you can have the model write something like:
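```plaintext
Analysis:
- Current enemy distance = 9, weapon cooldown = 0, health = 100%
- => Offense chain applies
- Step 1: <Attack_Unit(0x1004c0001)>, because it is the nearest / has the lowest health
- Step 2 (optional): <Move_Screen([160, 190])>, so there is still time to fall back if the enemy closes in next second

Actions:
Team Colossus-1:
    <Attack_Unit(0x1004c0001)>
    <Move_Screen([160, 190])>
```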
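Then, at the start of the next decision (the next frame), check whether the Zergling was actually killed and whether the new position is safe, and run the meta decision again.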
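To execute a reply in the Analysis/Actions format above, the action tags have to be extracted from the text. The sketch below is one possible way to do this with a regular expression; it is an assumption-laden illustration, not the parser used by LLM-PySC2. The function name `parse_actions` is hypothetical, and only the two action types discussed here are recognized.

```python
import re

# Matches <Attack_Unit(...)> and <Move_Screen(...)> tags only.
ACTION_RE = re.compile(r"<(Attack_Unit|Move_Screen)\((.*?)\)>")


def parse_actions(text, max_actions=2):
    """Extract up to `max_actions` actions from a model reply.

    Only the part after the last "Actions:" marker is parsed, because the
    Analysis section may mention the same tags while explaining the plan.
    If the marker is missing, the whole text is parsed.
    """
    tail = text.rpartition("Actions:")[2]
    actions = []
    for name, arg in ACTION_RE.findall(tail):
        try:
            if name == "Attack_Unit":
                actions.append((name, int(arg, 16)))      # unit tag in hex
            else:
                x, y = (int(v) for v in arg.strip("[] ").split(","))
                actions.append((name, (x, y)))
        except ValueError:
            continue  # skip malformed arguments instead of failing
    return actions[:max_actions]
```

Truncating to `max_actions` mirrors the "at most 2 commands" rule in the prompt, so an over-long reply cannot queue extra actions.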
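This satisfies the "attack then move / move then attack" strategy you want while respecting the real limitation that results cannot be checked immediately, so that the execution of the multi-chain reasoning fits naturally with the game's progress.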
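Both the offense chain and the defense chain should be written with clearer logic, including an explicit statement of which chain to take at the start.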
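Below is a more detailed and logically explicit scheme, covering which chain to start with and the steps inside each chain. It also demonstrates how to output multiple actions in one decision (one reasoning pass), such as "attack then move" or "move then attack", while respecting the StarCraft II constraint that execution results can only be confirmed at the next frame.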
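At the start of each decision cycle (or each game frame), first make a meta decision to judge whether offense or defense currently has priority. Check the following conditions in order: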
- Weapon Cooldown: if the weapon is on cooldown (`IS_COLOSSUS_WEAPON_COOLDOWN = True`) and cannot attack immediately, lean toward defending / moving to buy time until the cooldown finishes.
- Enemy Distance
- Health
- Surrounded
Combining these conditions gives a simple if-else rule (or an equivalent natural-language description):
```plaintext
IF (weapon_cooldown > 0)
   OR (zerg_dist <= 7)
   OR (colossus_health < 0.6)
   OR (being_surrounded = True)
THEN
    choose the Defense Chain
ELSE
    choose the Offense Chain
```
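The same rule can be written as a small Python function, which is useful if the meta decision is made in code and only the selected chain is sent to the model. This is a minimal sketch: the `ColossusState` dataclass, its field names (taken from the pseudocode above), and the function `choose_chain` are assumptions for illustration, and the thresholds 7 and 0.6 simply repeat the values from the rule.

```python
from dataclasses import dataclass


@dataclass
class ColossusState:
    """Hypothetical container for the observations used by the meta decision."""
    weapon_cooldown: float        # 0 means the weapon is ready
    zerg_dist: float              # distance to the nearest enemy
    colossus_health: float        # health as a fraction in [0, 1]
    being_surrounded: bool = False


def choose_chain(state, safe_dist=7.0, min_health=0.6):
    """Return "defense" if any danger condition holds, otherwise "offense"."""
    if (
        state.weapon_cooldown > 0
        or state.zerg_dist <= safe_dist
        or state.colossus_health < min_health
        or state.being_surrounded
    ):
        return "defense"
    return "offense"
```

Making this decision in code keeps chain selection deterministic and lets the prompt carry only one chain's instructions, at the cost of giving the model no say in borderline cases.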
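Choosing the "offense chain" means that the weapon is off cooldown, no enemy is at point-blank range, and the Colossus is in reasonably good shape, so it can actively deal damage. If a single decision may issue multiple actions, the flow can be set up as follows: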
- Precondition: `IS_COLOSSUS_WEAPON_COOLDOWN = False`, and the nearest enemy distance > 7.
- First: `<Attack_Unit(0xTag)>`
- Then, optionally: `<Move_Screen([x, y])>`
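Choosing the "defense chain" means the priority is to avoid being engaged at close range or surrounded, or the weapon is on cooldown and cannot deal damage, so movement is needed to buy time and survive. The situation may still allow one shot after moving, in which case a single reasoning pass can move first and then attack.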
- First: `<Move_Screen([x, y])>`
- Then, optionally: `<Attack_Unit(0xTag)>`
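- Choose the chain at the very start
- Offense chain: attack first, then move
  - `<Attack_Unit(...)>` + `<Move_Screen(...)>`, to avoid being engaged at close range in the next second.
- Defense chain: move first, then attack
  - `<Move_Screen(...)>`;
  - `<Attack_Unit(...)>` to counterattack.
- Execution results cannot be checked within the same decision
- Advantages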
To sum up, your prompt or implementation logic can present the following structure:
- Meta decision module
- Offense Chain
- Defense Chain
- Analysis section
- Actions section:
```text
Team Colossus-1:
    <Attack_Unit(0x1004c0001)>
    <Move_Screen([160, 200])>
```
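This presents the complete process clearly, from "choosing the chain at the start" to "multi-step actions inside the chain", while accounting for the fact that the StarCraft II engine cannot report action results immediately, so the whole decision flow stays logical and easy to maintain.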