LLM-PySC2 项目概述
बनाया गया: 8 जनवरी 2025
बनाया गया: 8 जनवरी 2025
LLM-PySC2 is
NKU Robot Autonomy and Human-AI Collaboration Group and
NUDT Laboratory for Big Data and Decision's Python component of the StarCraft II LLM Decision Environment.
It wraps Deepmind's PySC2 Learning Environment API
in to a LLM energized Multi-Agent Decision Environment.
This is a collaboration between NKU and NUDT to develop StarCraft II into a rich environment for LLM research.
LLM-PySC2 provides an interface for LLM agents to interact with StarCraft 2,
getting textual/multimodal observations and dealing with textual actions.
We also make it possible for LLMs to make decisions in SMAC tasks, which make it possible to compare LLM with RL method.
If you use the LLM-PySC2 environment or LLM-SMAC tasks in your research,
please cite our github pager or
LLM StarCraft II Pre-Print Paper after preprint paper published.
You can also contact us by e-mail [email protected] or [email protected].
Arxiv paper: https://arxiv.org/abs/2411.05348 (Citation: https://ui.adsabs.harvard.edu/abs/2024arXiv241105348L/exportcitation)
LLM-PySC2 depends on the full StarCraft II game and only works with versions that
include the API, which is 3.16.1 and above.
Follow Blizzard's documentation to
get the linux version. By default, LLM-PySC2 expects the game to live in
~/StarCraftII/
. You can override this path by setting the SC2PATH
environment variable or creating your own run_config.
Install of the game as normal from Battle.net. Even the
Starter Edition will work.
If you used the default install location LLM-PySC2 should find the latest binary.
If you changed the install location, you might need to set the SC2PATH
environment variable with the correct location.
LLM-PySC2 should work on MacOS and Windows systems running Python 3.8+,
but has only been thoroughly tested on Linux. We welcome suggestions and patches
for better compatibility with other systems.
download the LLM-PySC2 code from our github page LLM-PySC2.
use pip install to initialize the environment:
shell$ conda create --name YOUR_ENV_NAME python==3.9 $ conda activate YOUR_ENV_NAME $ pip install -e .
you can use mirrors like pip install -e . -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
to speed up downloading.
We have placed the required maps in the project folder:
textllm_pysc2/maps/llm_pysc2 llm_pysc2/maps/llm_smac
You need to copy and paste these 2 folders
into the Maps folder of the StarCraft2 program. Generally, the folder path is:
C:\Program Files (x86)\StarCraft II\Maps
and finally looks like:
textC:\Program Files (x86)\StarCraft II\Maps\llm_pysc2 C:\Program Files (x86)\StarCraft II\Maps\llm_smac
If you used a custom path in installation, you may need to find the Map folder to finish the step.
If you do not know how to get api_key, you can contact us to obtain a temporary gpt-3.5-turbo api_key with 2M tokens for free.
You need to write your api_key in ./llm_pysc2/agents/configs/config.ProtossAgentConfig
before test the llm:
textclass ProtossAgentConfig(AgentConfig): def __init__(self): super(ProtossAgentConfig, self).__init__() self.race = 'protoss' self.model_name = 'gpt-3.5-turbo' self.api_base = 'YOUR_API_BASE' self.api_key = 'YOUR_API_KEY' ...
or set api_key like what we do in ./llm_pysc2/bin/experiment_llm_pysc2.py
:
config.reset_llm(model_name, api_base, api_key)
if you do not have api_key but still want to test the environment,
you can set config.LLM_SIMULATION_TIME = 5
to simulate a 5-second response large model
and continue the tutorial below.
After specify your LLM api_key, api_base and model_name, you can run our experiments to test LLM and
both the llm_pysc2 tasks and llm-smac tasks:
shell$ python -m llm_pysc2.bin.experiment_llm_pysc2 $ python -m llm_pysc2.bin.experiment_llm_smac
These two script will load gpt-3.5 energized agents and use pure text observation to make decisions.
If you want to use multimodal LLMs like gpt-4v, you can set config.ENABLE_IMAGE_RGB = True
to
activate image observations.
Also, you can use --parallel parameter (or edit files in ./llm_pysc2/bin) to run several games at the same time:
shell$ python -m pysc2.bin.agent --map pvz_task4_level1 --agent_race protoss --parallel 2 --agent llm_pysc2.bin.experiment_llm_pysc2.MainAgentLLMPysc2 $ python -m pysc2.bin.agent --map pvz_task4_level1 --agent_race protoss --parallel 4 --agent llm_pysc2.bin.experiment_llm_pysc2.MainAgentLLMPysc2
which may significantly improve experimental efficiency.
We provided two series of experiment tasks:
text(1) llm_pysc2 experiments: a series of pvz combat, with 8 different settings and 3 levels of difficult for each. (2) llm_smac experiments: same as original SMAC tasks, units control by LLM agent instead of RL agent.
You can run these experiments in ./llm_pysc2/bin/llm_pysc2
and ./llm_pysc2/bin/llm_smac
.
llm_smac experiments are original smac tasks, use the same map and setting of smac tasks. Consider that
the control of the Zerg and Terran is still ongoing, we will add more smac tasks in the future
(executable files in ./llm_pysc2/bin/llm_smac
):
text2s3z 3s5z 1c3s5z 3s_vs_3z 3s_vs_4z 3s_vs_5z 2c_vs_64zg 2s_vs_1sc 3s5z_vs_3s6z (more to be added in future version)
llm_pysc2 experiments contains 6 pvz combat with 3 level of difficulties, concentrate more attention to larger
scenarios and the use of unit skills, provide higher complexity and operability:
Task | Description | Text-Input | Image-Input | Multi-Agent | Communication |
---|---|---|---|---|---|
task1 | 2 Adept harass zerg base | ✔ | ✔ (can be disabled) | ✖ | ✖ |
task2 | 3 Phoenix harass zerg base | ✔ | ✔ (can be disabled) | ✖ | ✖ |
task3 | 6 Stalker intercept airdrops | ✔ | ✔ (can be disabled) | ✖ | ✖ |
task4 | 12 Stalkers vs many zerg units | ✔ | ✔ (can be disabled) | ✖ | ✖ |
task5 | 2c3d4s12s vs many zerg units | ✔ | ✔ (can be disabled) | ✔ (Isomorphic) | ✖ (can be enabled) |
task6 | 1a6h4s12s vs many zerg units | ✔ | ✔ (can be disabled) | ✔ (Isomorphic) | ✖ (can be enabled) |
task7 | 1m3c3t6v12s vs many zerg units | ✔ | ✔ (can be disabled) | ✔ (Heterogeneous) | ✔ (can be disabled) |
task8 | 2wp8bg12s multiline attack | ✔ | ✔ (can be disabled) | ✔ (Heterogeneous) | ✔ (can be disabled) |
Each of them has three different difficulties. Simply, level-1 can be used for demo and method debug;
level-2 is a standard difficulty, can be used for policy training, and level-3 can be used as an experimental field
for trained method evaluating, which is quite difficult, and can be served as a good OOD evaluating environment.
More details can be seen in ./docs/llm_pysc2/experiments.md
.
If you want to redefine a SubAgent's interaction process with the large model, you can redefine the query function of
a SubAgent. View relevant code of class Customized_LLMAgent(LLMAgent)
in ./llm_pysc2/agents/llm_pysc2_agent.py
:
textclass Customized_LLMAgent(LLMAgent) def query(self, obs) -> None:
Main agent is used to interact with pysc2 and does not directly make decisions, so it is not recommended to modify it.
You can affect the main agent by modifying the config just like what we do in.
textllm_pysc2/bin/experiment_llm_pysc2.py llm_pysc2/bin/experiment_llm_smac.py
MainAgent is only an objects used for scheduling cameras, managing internal data and interacting with the env.
While SubAgents(of the MainAgent) query llm to obtain text actions, plays the role of decision maker.
SubAgent is configured in configurations(llm_pysc2/agents/configs).
For example, ./llm_pysc2/agents/configs/llm_smac/config_2s3z.py
defines a MainAgent with only one sub agent named
CombatGroupSmac
:
textclass ConfigSmac_2s3z(ProtossAgentConfig): def __init__(self): super(ConfigSmac_2s3z, self).__init__() ... self.AGENTS = { 'CombatGroupSmac': { 'describe': "Protoss military commander, controls units to fight against enemy. ", 'llm': {...}, 'team': [ {'name': 'Zealot-1', 'unit_type': [units.Protoss.Zealot], 'game_group': 1, 'select_type': 'group'}, {'name': 'Zealot-2', 'unit_type': [units.Protoss.Zealot], 'game_group': 2, 'select_type': 'group'}, {'name': 'Stalker-1', 'unit_type': [units.Protoss.Stalker], 'game_group': 4, 'select_type': 'group'}, ], 'action': { units.Protoss.Zealot: PROTOSS_BASIC_ACTION_SMAC, units.Protoss.Stalker: PROTOSS_BASIC_ACTION_SMAC, }, }, } ...
you can add more sub agent in your configuration if needed.
You can design agent's unit team in config.
In the example above, we showed a sub agent with 3 teams called 'Zealot-1', 'Zealot-2' and 'Stalker-1'.
You can redefine your agent teams if needed.
A team should consist of:
text'name': any str 'unit_type': list, of pysc2 unit enum 'game_group': int, -1 to 9, -1 refers to do not add units to any game_group 'select_type': str, 'group' for group recall, 'select' for single select(mouse left click), 'select_all_type' for select screen units of same types(mouse double click)
More examples can be viewed in ./llm_pysc2/agents/configs
.
You can design team's action space in config.
In the example(ConfigSmac_2s3z) above, we showed a action space of CombatGroupSmac.
Where PROTOSS_BASIC_ACTION_SMAC
shape as:
[{'name': 'Attack_Unit', 'arg': ['tag'], 'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]},]
and PROTOSS_BASIC_ACTION_SMAC2
shape as:
text[{'name': 'Attack_Unit', 'arg': ['tag'], 'func': [(12, F.Attack_screen, ('queued', 'screen_tag'))]}, {'name': 'Move_Screen', 'arg': ['screen'], 'func': [(331, F.Move_screen, ('queued', 'screen'))]},]
Each action should consist of three parts:
text'name': any str 'arg': list, either of [] / ['tag'] / ['screen'] / ['minimap'] / ['tag', 'screen'] / ['tag', 'minimap'] 'func': a list of triplet (pysc2_func_id, pysc2_func, args_type)
More examples can be viewed in ./llm_pysc2/agents/configs
and ./llm_pysc2/lib/llm_actions
.
You can design your own Observation Wrapper by redefining o_translator
in ./llm_pysc2/lib/llm_observations.py
.
You can design your own Text Action Recognizer by redefining a_translator
in ./llm_pysc2/lib/llm_actions.py
.
You can design your own LLM Client in llm_pysc2/lib/llm_client
if needed.
We have planed to try our best to add the following features before 2025/2/1
:
text(1) Support of Zerg control (2) Support of Terran control (3) Compatibility of classic RL algorithms
And more features before 2025/5/1
:
text(1) Full game experiments series1: in map Simple64, Simple96, Simple128 (2) Full game experiments series2: in map Ancient Cistern LE, Babylon LE, Gresvan LE
Note that this LLM-PySC2 is a preview version, and official version with far more features and better stability
will be open-sourced before 2025/7/1
.
Zongyuan Li (Main Contributor, Nankai University):
Framework design. Multi-Agent structure. Text observation generation. Text action recognition.
Communication. Experiments design. Prompt. Logger. Documents. Organize.
Runnan Qi, Yanan Ni, Lumin Jiang (National University of Defense Technology):
LLM Client. Multimodal LLM Client. Image observation generation. Documents.
Chang Lu, Xiaojie Xu, Pengfei Li, Yunzheng Guo, Zhe Ma (Nankai University):
Data recorder. Experiments evaluation. Game knowledge.
Kuihua Huang (National University of Defense Technology), Xian Guo, Xuebo Zhang(Nankai University):
Organize.
LLM-PYSC2: STARCRAFT II LEARNING ENVIRONMENT FOR
LARGE LANGUAGE MODELS ∗†
Zongyuan Li1
, Yanan Ni2
, Runnan Qi2
, Lumin Jiang2
, Chang Lu1
, Xiaojie Xu1
,
Xiangbei Liu1
, Pengfei Li1
, Yunzheng Guo1
, Zhe Ma1
,
Xian Guo1,∗
, Kuihua Huang2,∗
, Xuebo Zhang1,∗
1 College of Artificial Intelligence, Nankai University
2 Laboratory for Big Data and Decision, National University of Defense Technology
ABSTRACT
This paper introduces a new environment LLM-PySC2 (the Large Language Model StarCraft II
Learning Environment), a platform derived from DeepMind’s StarCraft II Learning Environment
that serves to develop Large Language Models (LLMs) based decision-making methodologies. This
environment is the first to offer the complete StarCraft II action space, multi-modal observation
interfaces, and a structured game knowledge database, which are seamlessly connected with various
LLMs to facilitate the research of LLMs-based decision-making. To further support multi-agent
research, we developed an LLM collaborative framework that supports multi-agent concurrent queries
and multi-agent communication. In our experiments, the LLM-PySC2 environment is adapted to
be compatible with the StarCraft Multi-Agent Challenge (SMAC) task group and provided eight
new scenarios focused on macro-decision abilities. We evaluated nine mainstream LLMs in the
experiments, and results show that sufficient parameters are necessary for LLMs to make decisions,
but improving reasoning ability does not directly lead to better decision-making outcomes. Our
findings further indicate the importance of enabling large models to learn autonomously in the
deployment environment through parameter training or train-free learning techniques. Ultimately,
we expect that the LLM-PySC2 environment can promote research on learning methods for LLMs,
helping LLM-based methods better adapt to task scenarios.
1 Introduction
In 2017, the StarCraft II Learning Environment (SC2LE)[1] was developed by DeepMind and Blizzard Entertainment.
It is the first environment that enables various reinforcement learning (RL) agents to compete with each other in the
StarCraft II game, and promoted the emergence of decision-making methods such as QMix[2], Weighted QMIX[3],
MAPPO[4], and the household name AlphaStar[5]. However, RL-trained agents typically require a substantial amount
of data and prolonged interactions, but still lack generalization capabilities in most scenarios due to the task-relevant
reward function. Consequently, it is urgent to develop new decision-making methods at the present.
Additionally, impressive research efforts such as Stanford Town[6], LLM plays MineCraft[7] and the game of
Diplomacy[8] have demonstrated great potential in LLM-based decision-making in recent years. Considering that
large models exhibit greater interactivity, interpretability, and reasoning capabilities, it is quite natural to apply large
models in complex decision-making environments. However, there is no sufficiently comprehensive platform to support
research on LLM decision-making methods in complex environments. Notably, the mainstream platform SC2LE
environment does not yet support research on decision-making with large models.
In order to leverage the advantages of large models and circumvent the disadvantages of RL, researchers developed the
SC2 module into TextStarCraft II (TSC2)[9], enabling LLMs to interact with the StarCraft II environment for the first
time. However, there are some restrictions in the environment. The LLM based agent can not use micro-operations and
∗Corresponding author.
†Code is available at https://github.com/NKAI-Decision-Team/LLM-PySC2
arXiv:2411.05348v1 [cs.AI] 8 Nov 2024
Zongyuan Li et. al.
unit skills to defeat enemy units due to the scale-cropped discrete action space. While observation only contains unit
counts and upgrade status that are not sufficient for the implementation of complex strategies. What’s more, multi-agent
collaboration is not available because TSC2 is a single-agent framework.
To address these issues, we developed LLM-PySC2, an environment derived from SC2LE, based on PySC2 module.
This environment provides agents with comprehensive observations, including global information and agent-specific
local combat information (in text form or multimodal form) and a structured game knowledge database. We also
expanded the action space to the full StarCraft II action space, enabling agents to perform fine-grained operations and
unit skills. To support multi-agent research, we built a multi-agent framework with a communication system that allows
both point-to-point and domain communication.
In experiments, eight new scenarios were proposed. Unlike the SMAC[10] tasks, these tasks emphasize not only
micro-operations but also task understanding and macro-decision abilities. We tested nine mainstream LLMs in both the
SMAC tasks and the new proposed scenarios. The results indicate that pre-trained LLMs have possess decision-making
ability but lack the ability to make consistently effective decisions. Pre-trained LLMs without task-specific training may
be unable to analyze the key elements for achieving victory. They often fail to identify the important part of the game
knowledge for the most times, making mistakes in analysis or even dealing damage to allies sometimes.
In summary, there remains much to be done to raise the ability of LLMs in the domain of multi-agent decision making.
We hope that the LLM-PySC2 environment will advance research on LLM learning techniques, helping LLM-based
methods better adapt to task scenarios.
2 Related Works
2.1 Starcraft II
Starcraft II is a classic platform for evaluating algorithms. Specially, as a real-time strategy game, StarCraft II features a
high-dimensional partially observable state space and a huge continuous action space. With three species and more than
120 types of units, it is widely regarded as one of the most complex and challenging environments and is commonly
used for evaluating advanced decision-making methods.
To support the research of learning methods, DeepMind and Blizzard Entertainment developed SC2LE, a comprehensive
environment for RL research. This environment is designed to improve research in learning algorithms within complex
strategy games. It provides RL interfaces such as the observation, actions, and reward function, considered as one of the
most significant environments in the field of artificial intelligence.
Consequently, after the introduction of SC2LE, more and more StarCraft II environments have emerged. Among
those environments, SMAC[10] and PyMARL are the most famous. SMAC is a benchmark comprising 23 tasks
specifically designed for multi-agent RL, mostly focusing on distributed multi-agent decision-making. To evaluate
MARL algorithms, the SMAC team also developed PyMARL as their training platform. In the PyMARL framework,
over five algorithms are integrated, and the framework is gradually expanded into a multi-environment available RL
platform.
Overall, their work effectively advanced the research on multi-agent learning methods, made significant contributions
to the field of intelligent decision-making, and motivated us to develop an environment for LLM-based methods.
2.2 LLM Decision-Making and Text StarCraft II
In recent years, the decision-making ability of LLMs has started to attract attention. In 2023, an LLM-based agent
called the Ghost in Minecraft achieved 67.5% success in Minecraft’s diamond challenge. After that, Agent-Pro[11], an
LLM agent capable of using strategies like bluffing in Poker, was developed. Additionally, researchers deployed LLM
agents in Werewolf[12], a game with deception and counter-deception through communication, and developed LLM
agents in the game of Diplomacy, a game of collaboration and competition.
These works inspire researchers to develop LLM-based decision-making methods in games. As one of the most
famous real-time strategy games, StarCraft II was first developed into an LLM-interactable environment called TSC2.
This environment enables LLMs to make macro-decisions in StarCraft II and proves that LLMs can make decisions
and defeat build-in bots at level-4 in StarCraft-II. However, TSC2 does not support micro-operations on units and
multi-agent collaboration and faces limitations in observation and action space.
Under these circumstances, we constructed the LLM-PySC2 environment, aiming to solve these problems and provide
a new StarCraft-II environment. We also make our environment compatible with SMAC tasks, facilitating comparisons
with algorithms developed in the StarCraft environment.
2
Zongyuan Li et. al.
3 LLM-PySC2 environment
3.1 Framework
The LLM-PySC2 environment is built on the PySC2 module’s agent level. In Figure 1, the MainAgent plays the role of
controlling the camera, selecting units, collecting observations, and executing actions, while the LLM agent plays the
role of the actual decision maker that observes game situations, analyzes, and gives actions. Each LLM agent connects
to an LLM, getting a text or multimodal observations from a wrapper, querying the LLM in an independent thread, and
finally getting game analysis and actions.
LLMLLM-PySC2
MainAgent
pysc2.lib.actions.Functions.xx
obs
image
obs
agent info
relevant data
latest communication data
Starcraft2(PySC2)
obs1
obs2
obs3
obs4
analysis: xxcommu: xxactions: xx
systemprompt
example prompt
observed infoCommunication:
<MessageTo(Commander, ‘’‘xxxx’‘’)>
<MessageTo(Channel-1, ‘’‘xxxx’‘’)>
obs5
LLM-PySC2 MainAgent Opponent Agent
Build-inbot
RLAgent
AI Arenacamera
Actions:
Team Stalker-1:
<AttackUnit(0x1000a2001)>
<MoveScreen([54, 32])>
𝑡�� ��
𝑡�� ��
LLMAgent
Action Recognizer
��
�� ��
’ ��
’
𝑖� ��
Obs Wrapper
Game Wiki Data
Communication data
LLM Agent1
LLM Agent2
LLM Agent3
LLMLLMFigure 1: LLM-PySC2 framwork. In LLM-PySC2, the original PySC2 observation will transform into a text-form
observation. LLM-generated text action can be recognized and transformed into PySC2 action functions, enabling
LLMs to interact with the StarCraft II environment.
3.1.1 Interact with environment
An interaction step consists of two phases: auxiliary management and decision-making (and it consists of many game
steps). In the auxiliary management phase, no LLM will be involved. The MainAgent will control the PySC2 camera
and finish works like grouping newly trained units and managing idle workers to avoid excessive involvement of large
models in simple and repetitive labour.
Observations for each agent’s unit teams will be collected in the decision-making phase. After all teams’ observations
are collected, the agents use the Observation Wrapper to translate the structured observations into a text observation.
Then, all agents query remote or local LLMs concurrently, waiting until all the agents get the response.
After all agents get the response, they will use the Action Recognizer to detect valid actions and translate the text actions
into a structured form. Then, the MainAgent moves the camera to the same position when collecting observations and
executes each agent’s stored actions. After executing all the actions, the LLM-PySC2 environment will enter the next
interaction step and repeat the work mentioned above.
3.1.2 Multi-agent communication
Considering that LLMs have inherent advantages in interaction, we designed a communication system for the multiagent framework. In the communication system, agents communicate with each other using ’communication actions’, a
kind of text action similar to unit control actions shown in Figure 1.
3
Zongyuan Li et. al.
In the communication system. An LLM agent can send a message to another agent or send information to a channel.
If the message is sent to an agent, only the designated receiver can get the information. If the message is sent to a
channel, all agents that listen to the channel share the information. Through these communication actions, multi-agent
collaboration frameworks such as centralized decision-making and distributed decision-making can be easily built.
3.2 Observation
Observation is indispensable for decision making. Different types of information are necessary for agents with different
tasks. Roughly, we categorize observational information into two types: local observations for micro-level operations
and global observations for macro-level decision-making. These observations can be divided into text and image
observations according to form.
Valid Communicate Target: Agent XXX: ......
Team XXX: ...... Team XXX: ...... Query Message
System Prompt
Example Input
Example Output
Text Observation
Task Description
Game Time
Team unit info
Game Knowledge
Valid Actions
Valid Commu Target
Communication
Game Time: 00:50
Text Observation
Team XXX Info: ...... Team XXX Info: ......
Team Stalker-1 Info: Minimap Pos: ...... Controlled Units: ...... Nearby Ally Units: ...... Nearby Enemy Units: ...... Relevant Knowledge: Protoss.Stalker: ...... Zerg.Queen: ...... Zerg.Drone: ......
Valid Actions: Team Stalker-1: <Stop()>
<No_Operation()>
<Hold_Position()>
<Move_Minimap(minimap)>
<Move_Screen(screen)>
<Attack_Unit(tag)> Communication: From Agent XXX: ...... Tasks: Team Adept-1' task: Kill as much as enemy workers as possible.
Figure 2: Text observation for micro-operation LLM. Text observation is a part of the query message. It contains
many paragraphs, including team unit info, relevant game Knowledge, and valid actions. Semantic information is added
when the observation wrapper processes the original obs object.
3.2.1 Text Observation
Observation Wrapper for micro-operations This wrapper focuses on local observations. It provides detailed
information of controlled unit, nearby ally and nearby enemy unit for an agent. It extracts unit information from PySC2
obs object and the relevant game knowledge from the knowledge base. As shown in Figure 2 The text observation
generated by the wrapper includes game time, unit information, unit knowledge, valid actions, short-term memory,
communication data, and task descriptions. Agents using the wrapper are designed for micro-operations like fighting
with enemy units or constructing buildings in a specific position.
Observation Wrapper for macro-decisions This wrapper focuses on global observations. It provides deployment
information, unit counts, and upgrade status that are similar to the text observation of the TSC2 environment. For the
agent responsible for military deployment, text observation generated from the wrapper is used for supporting overall
strategy. For the agent responsible for development, the generated global observation will make the agent aware of the
current economy and technology situation, supporting the planning of future works of development.
3.2.2 Image Observation
In the complex environment of StarCraft II, relying solely on textual observations may prevent agents from fully
comprehending the battlefield dynamics. To enhance situational awareness, the LLM-PySC2 environment provides
multimodal observation. This feature enables multimodal large models to integrate visual information, leading to a
more accurate understanding of the situation. Figure 3 highlights two primary types of image observations: game image
observation and feature map observation. These visual inputs provide the agent with critical battlefield information,
facilitating tactical analysis and strategy development.
4
Zongyuan Li et. al.
Figure 3: Image observations. LLM-PySC2 directly extracts PySC2’s image observation, including the game image
and the feature map. It enhances these images by incorporating auxiliary lines, which assist LLMs in accurately
determining the coordinates of various positions. These images, interpretable by multimodal LLMs, provide decisionmakers with complex information, such as terrain layout and unit distribution on the map.
Figure 3 demonstrates the game image and feature map, which are directly extracted from the PySC2 interface. These
images are enhanced with auxiliary lines to provide coordinate information for large models. This approach enables
the agent to accurately perceive crucial battlefield elements, such as unit count and distribution, while also conveying
information that is challenging to express through text, such as terrain features and relative spatial relationships.
3.3 Action
In decision-making environments, the concept of "action" is pivotal to enable interactions between the agent and the
environment. In our framework, LLMs engage with the environment through text-based actions, which must adhere to a
specific format to be recognized and converted into PySC2 action functions. The process of processing text action into
PySC2 functions can be seen in Figure 4.
LLM Response
Analysis:
Team Adept-1 is already attacking a nearby Drone and
should continue this focus, while Team AdeptPhase-1 can
target another Drone nearby for maximum efficiency. Both
teams should remain aggressive to disrupt the enemy's
economy. Actions:
Team Adept-1:
<Attack_Unit(0x101340001)>
<Move_Screen([62, 64])>
Team AdeptPhase-1:
<Attack_Unit(0x1012c0001)>
<Move_Screen([66, 61])>
Team Adept-1:
<Attack_Unit(0x101340001)>
<Move_Screen([62, 64])>
Team AdeptPhase-1:
<Attack_Unit(0x1012c0001)>
<Move_Screen([66, 61])>
Text Action Recognize
pysc2.lib.actions.Attack_screen('now', [40, 47])
pysc2.lib.actions.Move_screen('queued', [62, 64])
pysc2.lib.actions.Attack_screen('now', [48, 51])
pysc2.lib.actions.Move_screen('queued', [66, 61])
0x101340001
0x1012c0001
unit tag unit positionGenerate Action Functions
Figure 4: Text action recognition. The default action recognizer recognizes text actions by searching the "Actions"
part in LLM’s response, extracting action names and arguments, searching for corresponding PySC2 functions, and
generating the callback form of the function.
Text Actions These actions are expressed in a syntax that is intuitive and descriptive, allowing the LLM to comprehend
the intended operation without additional context. A standard text action is encapsulated in angle brackets and several
arguments, shaped as <ActionName()>, <ActionName(arg0)>, or <ActionName(arg0, arg1)>. The arguments can
represent various elements, such as a unit tag, a screen coordinate, or a minimap position, allowing these actions to
encompass the complete continuous action space of PySC2.
5
Zongyuan Li et. al.
In the decision-making phase, LLM will be informed of currently available actions, such as <Attack_Unit(tag)>,
<Move_Screen(screen)> and <Select_Unit_Attack_Unit(tag, tag)>. The LLM can generate actions like <Attack_Unit(0x100030001)> or <Move_Screen([23, 37])> according to observed information and its purpose. If LLM
generated multiple text actions, the first action will be executed immediately, and the remaining actions will be added to
the action sequence waiting for execution.
Action Space All kinds of actions in PySC2 are available in our environment, however, each agent does not have to
face all the actions of its race. In our environment, the action space is agent-specific, allowing each agent to define a
unique set of actions. For the agent that controls units such as Stalkers, the action space consists of text actions like
<Stop()>, <No_Op()>, <Move_Screen(screen)>, <Move_Minimap(minimap)>, <Attack_Unit(tag)>, and do not consist
of actions like training units or research.
4 Experiments
4.1 Experiment Scenarios
To facilitate research in LLM-based decision-making, we have provided two sets of experiments: LLM-SMAC tasks and
LLM-PySC2 tasks. The LLM-SMAC tasks are the same as standard SMAC experiments, which serve as an excellent
bridge for comparing with RL-based methods. LLM-PySC2 tasks are new scenarios, which, compared to the SMAC
tasks designed specifically for micro-operations, place more emphasis on the large model’s ability to understand the
task scenario and make macro-level decisions.
4.1.1 LLM-SMAC tasks
LLM-SMAC tasks share the same settings as the original SMAC tasks. These tasks initialize units for both sides
and automatically raise attacks for enemy units. In these scenarios, the key to victory lies in concentrating firepower,
controlling combat distance, and, sometimes, in interaction frequency. They are good scenarios for comparing the
training data efficiency with RL-based methods but not good scenarios for utilizing the multitasking and macro
decision-making capabilities of LLMs.
4.1.2 LLM-PySC2 tasks
LLM-PySC2 tasks are the newly proposed experiment scenarios, a task group that tests agents’ situation analysis
capabilities, planning abilities, the application of knowledge, communication, and collaboration. Some of the tasks are
shown in Figure 5
(d) Task4: Mid scale combat (e) Task5: Large scale combat (type1)
(a) Task1: 2 Adept harass (b) Task2: 3 Pheonix harass (b) Task3: Intercept enemyairdrops
(f) Task6: Large scale combat (type2)
Figure 5: LLM-PySC task group. The LLM-PySC task group contains eight tasks, with three difficulties for each
task. Compared to SMAC tasks, they place more emphasis on macro decision-making, situation analysis, and skill
use. These scenarios are common in professional competitions. Winning these small tasks is beneficial for winning
complete games in the future.
6
Zongyuan Li et. al.
In these tasks, LLMs need to plan an infiltration route into the enemy base and kill enemy workers, or use unit skills to
implement specific tactics in a battle. In addition, these tasks are more suitable for researching multi-agent collaboration
methods, and implementing centralized or distributed decision-making for LLMs.
There are eight tasks in the LLM-PySC2 task group. Half of the task scenarios are single-agent decision-making
scenarios (from tasks 1 to 4), where one LLM agent controls multiple units, while the other half (from tasks 5 to 8)
tests the cooperation between agents, with multiple agents controlling multiple units with different tactical roles. In
LLM-PySC2 task group, image observation and multi-agent communication are available and can be easily disabled if
needed.
To avoid the situation where methods in SMAC can always reach the 100% winning rate of most tasks, we set three
different difficulty levels for our experiment group. From level 1 to level 3, the forces of the enemy gradually increase.
At a higher level, more units or upgrades will be added to the enemy side, ensuring these tasks can still be effective
even after the LLM-based decision-making methods have been well developed.
4.2 Experiment Results
To facilitate subsequent research, we tested the decision-making ability of various large models. All experiments were
conducted in StarCraft II of Version 5.0.13 (92440), LLM-PySC2 v0.1. We recorded a ratio of resources of the killed
unit over the dead unit (K/D rate) and the winning rate (WR, i.e. task completion rate). The combination of K/D rate
and WR reflects the performance of LLM in decision-making scenarios.
In the LLM-PySC2 environment, we provide series of LLMs, such as GPT-3, GPT-4, GPT-o1, GLM-4, Claude-3,
Llama-3.1. We tested some representative models among them, tested the performance of models with different
reasoning abilities in decision-making tasks (GPT-3.5, GPT-4o-mini, GPT-4o), and tested the performance of models
with different parameters based on the same architecture (Llama3.1-8b, Llama3.1-70b, Llama3.1-405b).
All experiments use the default configuration of the open-sourced codes. As a benchmark, we do not specially design
prompts to promote decision quality or instruct the LLMs to obtain victories, and all the LLMs are not fine-tuned in the
LLM-PySC2 environment. Results show that large models can make decisions and generate text actions in the correct
form. However, when the task is complex enough or requires a lot of micro-operations, large models may not perform
well, suggesting that training or other technical methods are necessary for improving their decision quality.
4.2.1 Experiment Results in LLM-SMAC tasks
In the LLM-SMAC tasks, we conducted 20 repeated experiments for 6 LLMs in each scenario. For scenarios where
decisions were made by GPT-3.5-turbo, we raised the number to 50 due to its good concurrency support and friendly
cost. In these experiments, all large models used textual observations. This setting is completely sufficient for scenarios
other than 2c_vs_64zg, as they basically did not need to utilize terrain information. Results are shown in Table 1.
Table 1: Kill/Death Rates and Winning Rates of LLMs in LLM-SMAC Tasks.
Model Name 2s3z 3s5z 1c3s5z 3s5z_vs_3s6z 2s_vs_1sc 2c_vs_64zg 3s_vs_3z
Gpt-3.5-turbo 0.60 (22%) 0.43 (4%) 0.91 (44%) 0.29 (0%) 0.01 (2%) 0.52 (0%) 0.05 (0%)
Gpt-4o-mini 0.66 (20%) 0.39 (0%) 1.01 (50%) 0.29 (0%) 0.00 (0%) 0.54(0%) 0.09 (0%)
Gpt-4o 0.76 (20%) 0.47 (0%) 0.80 (30%) 0.35 (0%) 0.00 (0%) 0.56 (0%) 0.15 (0%)
Claude3-haiku 0.58 (5%) 0.48 (0%) 0.48 (0%) 0.32 (0%) 0.00 (0%) 0.52 (0%) 0.10 (0%)
Llama3.1-8b 0.19 (0%) 0.23 (0%) 0.18 (0%) 0.14 (0%) 0.00 (0%) 0.49 (0%) 0.00 (0%)
Glm-4-plus 0.81(25%) 0.46 (0%) 0.47 (0%) 0.33 (0%) 0.00 (0%) 0.54 (5%) 0.15 (0%)
We found that, although large models can analyze the observation information and output actions in the correct form,
they performed poorly in SMAC tasks. On the one hand, due to LLM hallucinations and the lack of task-specific
knowledge, they can not deduce the principle that concentrated fire is the key to victory. On the other hand, even if
the observation provided the knowledge that Zealots have a higher attack efficiency than Stalkers, the large models
sometimes still chose to attack the enemy Stalkers first in tasks like 2s3z and 3s5z.
7
Zongyuan Li et. al.
4.2.2 Experiment Results in LLM-PySC2 tasks
The same as the experiments in SMAC, we conducted 20 repeated experiments for each large model in each scenario
and 50 for GPT-3.5-turbo. Considering that all models cannot complete the multi-line attack in Task 8, we only listed
the data from Task 1 to Task 7. Results are shown in Table 2 and 3.
Table 2: Kill/Death Rates and Winning Rates of Gpt-3.5-turbo in LLM-PySC2 Tasks (level-1/2/3).
Task level task1 task2 task3 task4 task5 task6 task7
task-level-1 1.23 (58%) 0.13 (4%) 6.63 (38%) 0.38 (0%) 0.61 (8%) 0.28 (0%) 1.29 (72%)
task-level-2 0.56 (5%) 0.04 (0%) 3.31 (5%) 0.34 (0%) 0.52 (0%) 0.20 (0%) 0.98 (25%)
task-level-3 0.39 (0%) 0.05 (0%) 1.99 (0%) 0.31 (0%) 0.40 (0%) 0.26 (0%) 0.62 (0%)
In table 2, we tested Gpt-3.5-turbo’s performance in all the levels of each task. These data can serve as benchmark values
for future research. These three levels of difficulty not only serve as validation scenarios for developed decision-making
methods in the future but can also be applied to out-of-distribution (OOD) tasks, such as training on level 2 and
validating on level 3.
Table 3: Kill/Death Rates and Winning Rates of LLMs in LLM-PySC2 Tasks (level-1).
Model Name task1 task2 task3 task4 task5 task6 task7
Gpt-3.5-turbo 1.23 (58%) 0.13 (4%) 6.63 (38%) 0.38 (0%) 0.61 (8%) 0.28 (0%) 1.29 (72%)
Gpt-4o-mini 1.67 (70%) 0.16 (0%) 3.46 (0%) 0.39 (0%) 0.62 (20%) 0.30 (0%) 1.02 (40%)
Gpt-4o 2.27 (80%) 0.16 (10%) Inf (100%) 0.46 (0%) TBD TBD TBD
Gpt-o1-mini 1.36 (60%) 0.04 (0%) TBD TBD TBD TBD TBD
Claude3-haiku 2.19 (90%) 0.19 (10%) 5.25 (40%) 0.34 (0%) 0.75 (25%) 0.33 (0%) 0.93 (45%)
Llama3.1-8b 0.28 (5%) 0.12 (5%) 14.9 (75%) 0.18 (0%) 0.48 (5%) 0.14 (0%) 0.71 (25%)
Llama3.1-70b 0.36 (15%) 0.14 (0%) 58.9 (95%) 0.33 (0%) 0.59 (15%) 0.31 (0%) 0.71 (30%)
Llama3.1-405b 0.70 (30%) 0.10 (0%) 3.0k(100%) 0.28 (0%) 0.56 (10%) 0.32 (0%) 0.47 (15%)
Glm-4-plus 0.78 (30%) 0.21 (5%) 153 (100%) 0.38 (0%) 0.60 (10%) 0.30 (0%) 1.03 (55%)
Based on the data presented in Table 3, two conclusions can be extrapolated. First, adequate parameters for the large
model are necessary for decision-making. Llama-3.1-8b, the model with minimum parameters, performs nearly the
worst among all the models we tested, while the 70b and 405b models perform better than the 8b model. Second,
improving reasoning ability does not lead to a linear improvement in decision-making ability. Although GPT-4o
performed the best in most experiments, it still had a zero winning rate in some tasks that can easily be completed,
such as task 4. These results lead to a conclusion: pre-trained large models cannot directly undertake complex
decision-making tasks, and learning in deployment scenarios is almost inevitable.
5 Discussion
In the experiments, we found that there are several deficiencies in LLM-based decision making.
Hallucinations. Hallucination is the first problem that leads to bad decisions. Sometimes, LLMs confuse screen
coordinates with minimap coordinates (as shown in Figure 6), or use unmentioned actions in the Valid Actions part
of the observation. Sometimes, LLMs even damage teammate units. Hallucination has become an urgent problem in
LLM-based decision making.
Poor knowledge utilization. Large models generally exhibit a significant deficiency in leveraging game-related
knowledge. In task 2, game knowledge shows that Phoenix’s GravitonBeam ability will prevent the unit from moving
and attacking. However, this ability is still overused, failing to obtain victories in task 2. In task 5, even the LLM knows
the PurificationNova of Disruptor deals a lot of damage, they use the skill on injured units, causing a large amount of
spillover damage.
8
Zongyuan Li et. al.
Game Info: ...... Team HighTemplar-1 Info:
Team minimap position: [23, 31]
Controlled Team Units:
Unit: HighTemplar Tag: 0x100240001 ScreenPos: [66, 69] ...... Unit: HighTemplar Tag: 0x101dc0001 ScreenPos: [66, 64] ......
......
...... Tasks: Team HighTemplar-1' task: Go to minimap coordinate [32, 32]. Now, start generating your analysis and actions:
Analysis:
Team HighTemplar-1 consists of several High Templars, each with full health andenergy. Our task is to move to the specified minimap coordinate [32, 32]. Beforeproceeding, we should prepare to engage any nearby enemy units with a Psionic
Storm, if they are in range. However, we currently have no information about nearby
enemy units, so we'll focus on reaching the destination while maintaining our ability toengage if necessary. Actions:
Team HighTemplar-1:
<Ability_PsiStorm_Screen([32, 32])>
<Move_Minimap([32, 32])>
<No_Operation()>
screen [32, 32] minimap [32, 32]
<Ability_PsiStorm_Screen([32, 32])>
Query
LLM��
��
Figure 6: LLM hallucination in decision-making. This is an example that LLM confuse the screen coordinates and
minimap coordinates, wasting a PsiStrom skill and 75 energy. This is just an instance of the hallucination of large
models. In fact, there are many other forms of expression, such as attacking teammates and incorrectly choosing priority
targets.
Poor understanding of the world. Lack of world understanding is a kind of lack of knowledge. Pre-trained LLMs are
generally not trained in decision-making tasks. They do not know how to win in each task. In task 4, for example, the
large model should use Stalker’s Blink ability to transfer injured units to the rear. However, this ability is rarely used,
resulting in the unit’s death and a zero winning rate in task 4, even though the LLM is told that Blink is commonly used
to pursue the enemy or retreat injured units.
Low quality collaboration. In the multi-agent tasks like task 5 to 8, LLM agents should collaborate with others and
defeat the enemy together. However, we found it difficult for these agents to reasonably allocate targets, coordinate
attack timing, and coordinate retreat timing, no matter whether they collaborate with or without a leadership/commander.
How to improve the collaboration performance of LLM agents is important in building a high-level multi-agent
decision-making system.
These issues hinder the application of LLMs in decision-making scenarios. Fortunately, there are many ways to improve
the decision-making ability of large models. For example, directly providing knowledge to LLMs may directly improve
their ability. However, providing LLMs knowledge or precisely annotated datasets usually demands quite a lot of
resources. Self-supervised learning is still the most attractive way to enhance decision-making ability, either through
reward-based or reward-free methods (such as LLM reflection), and either through parameter training or training-free
techniques.
6 Conclusion
In this paper, we introduce a new environment for LLM decision-making, the first environment that accommodates
continuous PySC2 actions, and the first LLM StarCraft II environment with a multi-agent framework and communication
system. In experiments, we test mainstream LLMs’ performance in both the LLM-SMAC and LLM-PySC2 task groups,
among which the LLM-PySC2 task group is a brand-new experimental scenario that we designed for large models.
Results of baseline tests show that LLMs can make decisions, generating actions in the correct form. Still, the decision
quality is relatively low and there are several problems like hallucinations, poor utilization of game knowledge, and
lack of world understanding. Results indicate that learning in the deployment environment is necessary for LLM-based
decision-making. We hope the LLM-PySC2 environment can promote research on LLM learning methods, helping
LLM-based decision-making methods better adapt to task scenarios.
9
Zongyuan Li et. al.
References
[1] Oriol Vinyals, Tim Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, and
Demis Hassabis. Starcraft ii: A new challenge for reinforcement learning. arXiv preprint arXiv:1708.04782,
2017.
[2] Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, and Shimon
Whiteson. QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. arXiv
e-prints, page arXiv:1803.11485, March 2018.
[3] Tabish Rashid, Gregory Farquhar, Bei Peng, and Shimon Whiteson. Weighted QMIX: Expanding Monotonic Value
Function Factorisation for Deep Multi-Agent Reinforcement Learning. arXiv e-prints, page arXiv:2006.10800,
June 2020.
[4] Chao Yu, Akash Velu, Eugene Vinitsky, Jiaxuan Gao, Yu Wang, Alexandre Bayen, and Yi Wu. The Surprising
Effectiveness of PPO in Cooperative, Multi-Agent Games. arXiv e-prints, page arXiv:2103.01955, March 2021.
[5] Oriol Vinyals, Igor Babuschkin, Wojciech M. Czarnecki, Michaël Mathieu, Andrew Dudzik, Junyoung Chung,
David Choi, Richard Powell, Timo Ewalds, Petko Georgiev, Junhyuk Oh, Dan Horgan, Manuel Kroiss, Ivo
Danihelka, Aja Huang, L. Sifre, Trevor Cai, John P. Agapiou, Max Jaderberg, Alexander Sasha Vezhnevets,
Rémi Leblond, Tobias Pohlen, Valentin Dalibard, David Budden, Yury Sulsky, James Molloy, Tom Le Paine,
Caglar Gulcehre, Ziyun Wang, Tobias Pfaff, Yuhuai Wu, Roman Ring, Dani Yogatama, Dario Wünsch, Katrina
McKinney, Oliver Smith, Tom Schaul, Timothy P. Lillicrap, Koray Kavukcuoglu, Demis Hassabis, Chris Apps,
and David Silver. Grandmaster level in starcraft ii using multi-agent reinforcement learning. Nature, 575:350 –
354, 2019.
[6] Joon Sung Park, Joseph C. O’Brien, Carrie J. Cai, Meredith Ringel Morris, Percy Liang, and Michael S. Bernstein.
Generative Agents: Interactive Simulacra of Human Behavior. arXiv e-prints, page arXiv:2304.03442, April
2023.
[7] Xizhou Zhu, Yuntao Chen, Hao Tian, Chenxin Tao, Weijie Su, Chenyu Yang, Gao Huang, Bin Li, Lewei Lu,
Xiaogang Wang, et al. Ghost in the minecraft: Generally capable agents for open-world environments via large
language models with text-based knowledge and memory. arXiv preprint arXiv:2305.17144, 2023.
[8] Meta Fundamental AI Research Diplomacy Team (FAIR)†, Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele
Farina, Colin Flaherty, Daniel Fried, Andrew Goff, Jonathan Gray, Hengyuan Hu, et al. Human-level play in the
game of diplomacy by combining language models with strategic reasoning. Science, 378(6624):1067–1074,
2022.
[9] Anonymous. Large language models play starcraft ii: Benchmarks and a chain of summarization approach. arXiv
preprint arXiv:2312.11865, 2023. Accessed: 2024-10-21.
[10] Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, Gregory Farquhar, Nantas Nardelli, Tim GJ
Rudner, Chia-Man Hung, Philip HS Torr, Jakob Foerster, and Shimon Whiteson. The starcraft multi-agent
challenge. In Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems
(AAMAS), pages 2186–2188, 2019.
[11] Wenqi Zhang, Ke Tang, Hai Wu, Mengna Wang, Yongliang Shen, Guiyang Hou, Zeqi Tan, Peng Li, Yueting
Zhuang, and Weiming Lu. Agent-pro: Learning to evolve via policy-level reflection and optimization. arXiv
preprint arXiv:2402.17574, 2024.
[12] Yuzhuang Xu, Shuo Wang, Peng Li, Fuwen Luo, Xiaolong Wang, Weidong Liu, and Yang Liu. Exploring large
language models for communication games: An empirical study on werewolf. arXiv preprint arXiv:2309.04658,
2023.
10
现在请你结合论文内容,理解这个项目
LLM-PySC2 是一个基于 Python 的 StarCraft II 学习环境,旨在将大型语言模型(LLMs)集成到 StarCraft II 的多智能体决策环境中。该项目由南开大学机器人自治与人机协作团队(NKU Robot Autonomy and Human-AI Collaboration Group)与国防科技大学大数据与决策实验室(NUDT Laboratory for Big Data and Decision)合作开发。以下是对该项目的详细理解:
实验任务:
实验结果:
短期目标(截至 2025 年 2 月):
长期目标(截至 2025 年 5 月):
正式版本计划:计划在 2025 年 7 月前开源更稳定、功能更完善的正式版本。
LLM-PySC2 作为一个新型的环境平台,填补了将大型语言模型应用于复杂多智能体决策环境的空白。尽管当前 LLM 在决策质量和协作能力上存在不足,但通过进一步的训练和优化,有望显著提升其在 StarCraft II 等复杂环境中的表现。项目团队希望通过开源和社区合作,推动基于 LLM 的决策方法研究,促进其在实际任务场景中的适应性和效果。
项目由南开大学和国防科技大学的多位研究人员共同开发,涵盖框架设计、LLM 客户端开发、数据记录、实验评估等多个方面。
总结:LLM-PySC2 是一个创新的项目,旨在将大型语言模型应用于 StarCraft II 的复杂决策环境中,通过多智能体协作和丰富的实验任务,探索 LLM 在实时策略游戏中的潜力与挑战。尽管当前存在一些技术难题,但其开源和持续优化的计划为未来的研究和应用提供了广阔的空间。
我在这个环境上做了Visual-CoT的工作,请你理解这个工作。
Visual-CoT: A Visual Guided Chain of Thought Framework
for Fine-grained Control in StarCraft II Actions
Yanan Nia∗
, Runnan Qia
, Zongyuan Lib
, Kuihua Huanga
, Lumin Jianga
, Xiaojie Xub
,
Guo Xianb
, and Xuebo Zhangb
aLaboratory for Big Data and Decision, National University of Defense Technology, Changsha,
China
bCollege of Artificial Intelligence, Nankai University, Tianjin, China
ABSTRACT
In recent years, large language models (LLMs) have made significant progress in the field of natural language
processing. However, fine-grained action control in complex environments remains a significant challenge. To
address this issue, we propose a framework called Visual-CoT (Visual guided Chain of Thought), which aims to
enhance the ability of LLMs to perform fine-grained action control in complex strategy games such as StarCraft
II. This framework integrates visual information with the chain of thought, allowing visual data to seamlessly
interact with the reasoning and decision-making processes of LLMs. We implement Visual-CoT on the LLMPySC2 environment (the Large Language Model StarCraft II Learning Environment) and validate it on a map
featuring high-ground terrain information, simulating a scenario where one Colossus faces thirty-two Zerglings.
Experimental results show that the Visual-CoT framework improves the LLM’s win rate to 45% (winning 9 out
of 20 games), significantly enhancing fine-grained control and precise action execution capabilities. The code is
open-sourced and available at https://github.com/Flycat-Tom/Visual-COT.
Keywords: large language models, Visual-CoT, fine-grained action control, StarCraft II, strategy games
Visual-CoT: A Visual Guided Chain of Thought Framework for Fine-grained Control in StarCraft II Actions 是一个旨在提升大型语言模型(LLMs)在复杂策略游戏(如《星际争霸 II》)中进行精细化动作控制能力的框架。该项目在 LLM-PySC2 环境上实现,结合视觉信息与思维链(Chain of Thought, CoT)方法,显著增强了 LLM 在实时决策和策略执行中的表现。
提出 Visual-CoT 框架:
在 LLM-PySC2 环境上的验证:
显著的性能提升:
系统提示(System Prompt):
示例输入和输出提示:
视觉信息提取:
地图与场景设计:
视觉信息提取:
提示生成:
思维链生成过程:
精细化动作控制策略:
实验设置:
评估指标:
实验结果:
条件判断准确性分析:
结论:
未来工作:
Visual-CoT 框架通过将视觉信息与思维链推理相结合,显著提升了大型语言模型在《星际争霸 II》等复杂策略游戏中的动作控制和决策能力。通过在 LLM-PySC2 环境中的验证,证明了其在战场环境理解和精细化动作执行中的有效性。尽管当前已经取得了显著的成果,但未来仍需进一步验证其泛化能力、扩展至多智能体决策、集成更先进的视觉技术以及探索跨领域应用,以推动基于 LLM 的复杂决策方法的发展。
我给出你这个思维链具体的内容:# Copyright 2024, LLM-PySC2 Contributors. All Rights Reserved.
class BasePrompt:
textdef __init__(self): self.sp = '' self.eip = '' self.eop = '' self.screen_img_rgb_prompt = '' self.screen_img_fea_prompt = '' self.minimap_img_rgb_prompt = '' self.minimap_img_fea_prompt = ''
class CombatGroupPrompt(BasePrompt):
textdef __init__(self, name, log_id, config): super(CombatGroupPrompt, self).__init__() self.name = name self.config = config self.log_id = log_id # Part 1 self.sp = f""" 1. Role Description You are a {self.config.AGENTS[self.name]['describe']}. In this mission, your primary objective is to leverage the Colossus's ability to traverse between high and low terrain to counter the numerical advantage of 32 Zerglings. Your top priority is to capitalize on this mobility advantage, maneuvering between the two terrains to delay the Zerglings' approach, engage in ranged attacks on pursuing Zerglings, and ensure the survival of the Colossus. 2.Map Information 2.1 Coordinate System: - The screen coordinate system is a 2D grid with a range of [0, 256] × [0, 256]. - Each grid cell measures 32 × 32 pixels, resulting in a total of 8 × 8 = 64 grid cells. - The origin point (0, 0) is at the top-left corner, while (256, 256) is at the bottom-right corner. 2.2 Terrain Information: - Pathway (connecting the high ground and low ground): The map consists of an upper platform (high ground) and a lower platform (low ground), connected by a narrow pathway on the left side. - High Ground (light green area): The high ground is characterized by light green grass textures and is located on the upper platform. - Low Ground (dark gray area): The low ground is characterized by dark gray concrete textures and is located on the lower platform. - Impassable Area (black background): The edges of the map consist of a black background, which is impassable. The Colossus must avoid entering these areas. - Cliff (light gray vertical surface): The cliff separates the high ground and low ground. The Colossus can freely traverse the cliff, but Zerglings can only access the high ground through the narrow pathway on the left. Only the boundary between the high ground and low ground is considered a cliff; other deep red boundaries are impassable edges of the map. - Boundary: The boundaries between the high ground and the black background, as well as between the low ground and the black background, are considered impassable edges of the map. The boundaries are represented by deep red lines on the map. 3. Key Victory Points: 3.1 Maintain Maximum Attack Range: - Ideally, the Colossus should stay within its maximum attack range (range 7) to fully utilize its long-range attack advantage while minimizing the risk of melee attacks from enemies. 3.2 Focus Fire on Dense Enemy Clusters: - Target the center of the densest enemy groups to maximize splash damage and quickly reduce enemy numbers. 3.3 Cliff Barrier Delay Effect: The Cliff Barrier Delay Effect refers to leveraging the terrain characteristics of cliffs, where enemy units cannot directly cross the cliff, while allied units can freely traverse it. This forces the enemy to take a detour when transitioning between high and low ground, buying time for allied units to reposition, cool down weapons, and strike effectively. - When enemies are heavily concentrated on the high ground: - If allied units are also on the high ground but risk being surrounded, it is recommended to retreat to the bottom-right low ground. This utilizes the cliff as a barrier to delay enemy pursuit, enabling ranged attacks on the enemies while maintaining battlefield control. - When enemies are heavily concentrated on the low ground: - If allied units are also on the low ground but risk being surrounded, it is recommended to retreat to the top-right high ground. The cliff barrier forces enemies to take a detour, providing time for allied units to reposition and counterattack effectively.
4. Decision Process
Based on the observed data of the Colossus and Zerglings, as well as the RGB image of the screen, use the following decision rules to determine action priorities:
text4.1 Condition Assessment Rules: - IS_HIGH_GROUND: - Criteria: If the Colossus is on the green high ground (above the cliff, with a clear boundary from the low ground), set to True; otherwise, set to False. - Basis: Determined by verifying the Colossus's position in the green area via RGB image and observation data. - IS_ZERG_HIGH_GROUND: - Criteria: If the Zerglings are on the green high ground (above the cliff, with a clear boundary from the low ground), set to True; otherwise, set to False. - Basis: Determined by analyzing the Zerglings' positions via RGB image. - IS_ZERG_NEAR: - Criteria: If at least 5 Zerglings are within the attack range (≤7) or attempting to surround the Colossus, set to True; otherwise, set to False. - Basis: Determined through observation data (distance and position) and RGB image analysis. - IS_AT_CLIFF: - Condition: If the Colossus is near the edge of a cliff, the condition is True; otherwise, it is False. It is important to distinguish between a cliff and a boundary. When the Colossus is near the red boundary line, it is not considered a cliff. A cliff is the transitional area between the highland and the lowland. - Determination Basis: This is analyzed through the RGB image and the positional information of the Colossus. - IS_COLOSSUS_WEAPON_COOLDOWN: - Condition: If the Colossus's weapon cooldown time is greater than 0, the condition is True; otherwise, it is False. - Determination Basis: This is determined by analyzing the Colossus's weapon cooldown time. - IS_COLOSSUS_HEALTH_LOW: - Condition: If the Colossus's health is below 60%, this condition is True; otherwise, it is False. - Determination Basis: Determined by observing the Colossus's health. 4.2 Action Decision Rules - Special Scenarios: rule1 :As long as the weapon is in cooldown, the priority of movement actions is higher than attack actions. Movement > Attack. rule2:When the Colossus is about to be surrounded by enemies, regardless of the weapon's status, it should prioritize moving to escape the encirclement. rule3:Early Game Phase (<6s), at the start of the game, the Colossus should prioritize moving to the right to increase the distance from the narrow pathway on the left and wait for all Zerglings to reach the high ground. During this phase, movement takes priority over attacking. - Situation 1: The Colossus and Zerglings are in different areas - Weapon not on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = False): - Priority: Attack. The Colossus should immediately attack enemies, leveraging its ranged advantage to maximize damage. - Weapon on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True): - Priority: Move. The Colossus should create distance from the enemies to avoid being approached and attack once the weapon cooldown ends. - Situation 2: The Colossus and Zerglings are in the same area - Enemies are near and about to attack (IS_ZERG_NEAR = True, Zerglings approaching): - Priority: Move. Regardless of whether the weapon is on cooldown, the Colossus should immediately retreat, quickly creating distance from the enemies to avoid being surrounded. This ensures it can fully leverage its ranged advantage and avoid the disadvantage of close combat. Once out of danger, it can then look for opportunities to counterattack. - Enemies are not near (IS_ZERG_NEAR = False): - Weapon not on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = False): - Priority: Attack. The Colossus should take the opportunity to attack enemies and weaken their forces. - Weapon on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True): - Priority: Move. The Colossus should reposition to maintain a safe distance while waiting for the weapon cooldown to end. - Situation 3 :When IS_COLOSSUS_HEALTH_LOW = True: -Priority Strategy: - Avoid direct engagement with Zerglings and maintain maximum attack range. - Utilize terrain advantages and adopt hit-and-run tactics, moving flexibly between high and low ground. - Priority: Movement > Attack
5 Integrated Strategy and Decision Framework
1. Strategy Suggestions
text1.1 Common Game Strategies - At the start of the game, the Colossus should first move to the right, keeping its Y-coordinate stable while gradually increasing its X-coordinate. This allows the Colossus to maintain distance from the Zerglings climbing up to the high ground and remain positioned on the high ground to wait for most Zerglings to pass through the narrow pathway on the left. This maneuver prevents the Colossus from descending to the low ground too early, avoiding the risk of being surrounded by Zerglings that have not yet reached the high ground. - Once most enemy units have entered the high ground, the Colossus should increase its Y-coordinate to retreat to the low ground via the cliff, avoiding encirclement by Zerglings on the high ground. 1.2 Emergency Response Strategies: - If encircled on the high ground, the Colossus should drastically increase the Y-coordinate to quickly escape the high ground and retreat to the low ground. - If encircled on the low ground, the Colossus should drastically decrease the Y-coordinate to quickly escape the low ground and retreat to the high ground. 2. Movement and Attack Decision 2.1 When Movement Is Required: - Phase Judgment: - Determine the current phase based on game time and the positions of both allied and enemy units. - Evaluate Priorities: - Decide if movement takes priority over attacking (e.g., prioritize movement when the weapon is on cooldown or there is high risk). - Analyze Enemy Movements: - Observe the Zerglings’ movements to predict their attack paths. - Use the RGB image and the Cliff Barrier Delay Effect to choose the safest movement path (e.g., move down-right or up-right). - Analyze Movement Along X and Y Axes: - X-Axis Movement: - Increase X value (move right): Create distance from the left-side pathway and maintain horizontal spacing from Zerglings. - Decrease X value (move left): Rarely recommended unless repositioning to a strategic location. - Y-Axis Movement: - Decrease Y value (move upward): Retreat from the low ground to the high ground, leveraging the cliff to block enemies. - Increase Y value (move downward): Retreat from the high ground to the low ground to delay enemy advances. 2.2 When Attacking Is Required: - Target Selection: - Prioritize the densest groups of enemies to maximize splash damage. - Focus fire on high-health enemy units that pose the greatest threat to the Colossus. - Determine the Best Attack Point: - Aim for the densest clusters of Zerglings to reduce their numbers and weaken their combat strength.
6.Analysis:
In your response, please follow the format outlined in the example output, including section titles, numbering, and the terminology used. Ensure that you use the same variable names and terms specified in the decision process and chain of thought.
textThe output should include: 1. Decision Process 1.1 Condition Assessment: - `IS_COLOSSUS_HIGH_GROUND = True/False` - `IS_ZERGLING_HIGH_GROUND = True/False` - `IS_ZERGLING_NEAR_COLOSSUS = True/False` - `IS_COLOSSUS_AT_CLIFF = True/False` - `IS_COLOSSUS_WEAPON_COOLDOWN = True/False` - `IS_COLOSSUS_HEALTH_LOW = True/False' 1.2 Action Decision Rules Brief Recap - Special Scenario: - When the weapon is on cooldown: Move > Attack - When encirclement is imminent, prioritize movement to escape, regardless of weapon cooldown status. - Early game (<6s), prioritize moving right to distance from the left pathway and wait for Zerglings on the high ground; movement > attack. - Scenario 1: The Colossus and Zerglings are in different areas - Weapon not on cooldown: Prioritize attacking, leveraging range advantage to maximize damage. - Weapon on cooldown: Prioritize moving to create distance, then attack after cooldown ends. - Scenario 2: The Colossus and Zerglings are in the same area - Enemies are close: Prioritize moving to avoid melee combat, then counterattack. - Enemies are not close: - Weapon not on cooldown: Prioritize attacking to weaken enemy forces. - Weapon on cooldown: Prioritize moving to maintain a safe distance. - Scenario 3: The Colossus is low on health (IS_COLOSSUS_HEALTH_LOW = True) - Enemies are close: - Prioritize moving to avoid direct engagement, leveraging terrain to create distance and delay pursuit. - Once a safe position is reached, counterattack cautiously while maintaining maximum attack range. - Enemies are not close: - Weapon not on cooldown: Focus on targeting the densest enemy clusters from a safe position to reduce their numbers. - Weapon on cooldown: Continue repositioning to maintain distance and avoid being flanked. 1.3 Judgment Based on 4.2 Action Decision Rules: -Provide movement and attack priorities, e.g., attack > move or move > attack. -Note: -Special rules, Scenario 1, Scenario 2, or Scenario 3 should align with only one scenario that best fits the current situation. -Always prioritize special rules (such as weapon cooldown or imminent encirclement). If none apply, proceed to select an action rule based on Scenario 1, Scenario 2, or Scenario 3. 2 Integrated Strategy and Decision Framework 2.1 Strategy Suggestions - Current Phase Determination: Dynamically determine the strategy based on time, the Colossus’s position, and Zergling distribution. Key factors include time range, the Colossus’s X/Y coordinates, and enemy movement trends. - Consider Common Action Recommendations:: - Early Game Observation Phase: Initially, keep the X-coordinate steady while gradually increasing the Y-coordinate to monitor enemy movements. Utilize the Colossus’s range advantage to weaken enemy forces. - Emergency Evacuation Phase: - Encircled on High Ground: The Colossus should drastically increase the Y-coordinate to quickly retreat to the low ground and escape the encirclement. - Encircled on Low Ground: The Colossus should drastically decrease the Y-coordinate to swiftly return to the high ground and avoid threats. - Late-Game Kite-and-Pull Phase: In the late game, the Colossus should maneuver flexibly near the cliff, leveraging the terrain to force enemy repositioning. Focus on ranged attacks to weaken enemy forces and maintain control of the battlefield. 2.2 Movement Decision Analysis - Enemy Approach Path Prediction: Analyze the primary approach paths of enemy units and identify potential threats. - Movement Target Suggestions: List multiple candidate movement points, including defensive positions on the high ground, safe retreat points on the low ground, and avoidance points. Specify the most suitable movement target (e.g., a safe spot on the high ground or near the cliff on the low ground). Explain the reasoning behind the selection (e.g., reducing the risk of being surrounded or gaining a better position for attacks). 2.3 Attack Decision Analysis - Target Prioritization: Prioritize attacking dense enemy clusters to maximize splash damage. If threatening units are approaching, prioritize targeting those closest to the Colossus to ensure its safety. - Attack Target Selection: Identify multiple candidate attack points (e.g., the center of a dense enemy group or the closest group to the Colossus). Specify the optimal attack point and provide reasoning (e.g., to maximize damage or avoid being overwhelmed). 7.Action Output: Team Colossus-1: Action steps. """ self.eip = \ """ Game Info Time: 0:05 Team Colossus-1 Info: Team minimap position: [31, 27] Controlled Team Units: Unit: Colossus, Tag: 0x100000001, ScreenPos: [150, 123], Health: 350 (100%), Weapon Cooldown Time: 0.91s Nearby Enemy Units: Unit: Zergling, Tag: 0x1002c0001, ScreenPos: [52, 164], Distance: 10, Health: 6 (17%) Unit: Zergling, Tag: 0x100180001, ScreenPos: [79, 159], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100440001, ScreenPos: [112, 140], Distance: 4, Health: 35 (100%) Unit: Zergling, Tag: 0x100640001, ScreenPos: [38, 176], Distance: 12, Health: 21 (60%) Unit: Zergling, Tag: 0x100700001, ScreenPos: [52, 178], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x100780001, ScreenPos: [42, 169], Distance: 11, Health: 6 (17%) Unit: Zergling, Tag: 0x100540001, ScreenPos: [49, 171], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x100500001, ScreenPos: [91, 146], Distance: 6, Health: 35 (100%) Unit: Zergling, Tag: 0x100800001, ScreenPos: [83, 147], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100280001, ScreenPos: [96, 151], Distance: 6, Health: 21 (60%) Unit: Zergling, Tag: 0x1000c0001, ScreenPos: [59, 162], Distance: 9, Health: 35 (100%) Unit: Zergling, Tag: 0x100040001, ScreenPos: [93, 157], Distance: 6, Health: 35 (100%) Unit: Zergling, Tag: 0x100340001, ScreenPos: [67, 172], Distance: 9, Health: 35 (100%) Unit: Zergling, Tag: 0x1001c0001, ScreenPos: [72, 161], Distance: 8, Health: 6 (17%) Unit: Zergling, Tag: 0x100600001, ScreenPos: [76, 152], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100140001, ScreenPos: [86, 162], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x1004c0001, ScreenPos: [117, 145], Distance: 3, Health: 35 (100%) Unit: Zergling, Tag: 0x100400001, ScreenPos: [66, 164], Distance: 9, Health: 21 (60%) Unit: Zergling, Tag: 0x100200001, ScreenPos: [78, 166], Distance: 8, Health: 35 (100%) Unit: Zergling, Tag: 0x100240001, ScreenPos: [104, 144], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100100001, ScreenPos: [59, 172], Distance: 10, Health: 35 (100%) Unit: Zergling, Tag: 0x1003c0001, ScreenPos: [86, 154], Distance: 7, Health: 6 (17%) Unit: Zergling, Tag: 0x100480001, ScreenPos: [46, 159], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x1006c0001, ScreenPos: [69, 155], Distance: 8, Health: 35 (100%) Unit: Zergling, Tag: 0x100580001, ScreenPos: [59, 154], Distance: 9, Health: 6 (17%) Unit: Zergling, Tag: 0x100740001, ScreenPos: [130, 134], Distance: 2, Health: 35 (100%) Unit: Zergling, Tag: 0x1005c0001, ScreenPos: [97, 142], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100380001, ScreenPos: [39, 183], Distance: 12, Health: 35 (100%) Unit: Zergling, Tag: 0x100080001, ScreenPos: [106, 151], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100300001, ScreenPos: [124, 137], Distance: 2, Health: 35 (100%) Relevant Knowledge: Protoss.Colossus The large quad-legged vehicle fires lasers in a splash pattern well-suited to destroying swarms of weaker units. This unit can also traverse differences in terrain height due to its long legs, and will appear to step over ledges and other obstacles due to the inverse kinematics system. Unit properties: ['ground', 'air', 'armored', 'massive', 'mechanical'] Weapon info: Attack Range 7, target: ['ground'], anti: ['light'], DPS(damage per second) 13, DPS-anti 20 Unit abilities: Zerg.Zergling Fast but weak melee attacker ideal for swarming attacks in large numbers. Unit properties: ['ground', 'light', 'biological'] Weapon info: Attack Range 1, target: ['ground'], DPS(damage per second) 10 Valid Actions: Team Colossus-1 Valid Actions: <Attack_Unit(tag)> <Move_Screen(screen)> Action Args: (1) tag: tag refers to a hexadecimal number, shape as 0x000000000. (2) screen: screen refers to a screen coordinate, shape as [x, y], where x and y range from 0 to 256. (3) minimap: minimap refers to a minimap coordinate, shape as [x, y], where x and y range from 0 to 64. For example, when you want to use an action like <Action_Name(tag, screen)>, you should output like <Action_Name(0x100580001, [37, 55])>; when you want to use an action like <Action_Name(screen)>, you should output like <Action_Name([66, 78])>. Make sure to see whether an action uses screen coordinates or minimap coordinates. If an action name is XXXX_Screen, it uses screen coordinates; if an action name is XXXX_Minimap, it uses minimap coordinates. Last Step Actions: Team Colossus-1: <Move_Screen([160, 130])> <Attack_Unit(0x1007c0001)> <Attack_Unit(0x100740001)> You need to confirm whether the previous action finished executing, and based on this, determine whether to continue the old strategy or immediately take other actions. Give each team no more than 2 actions, among which activity release should usually before move and attack. Now, start generating your analysis and actions: """ self.eop = \ """
Analysis:
Decision Process
1.1 Condition Assessment
text- IS_COLOSSUS_HIGH_GROUND = True The Colossus is currently on the high ground, confirmed via the image and coordinates. - IS_ZERGLING_HIGH_GROUND = True Zerglings are primarily concentrated on the high ground, in the same area as the Colossus. - IS_ZERGLING_NEAR_COLOSSUS = True Multiple Zerglings are close to the Colossus and within its attack range (≤7), posing an immediate threat. - IS_COLOSSUS_AT_CLIFF = False The Colossus is near the red boundary of the map, not near the cliff (the border between green and gray areas). - IS_COLOSSUS_WEAPON_COOLDOWN = True The weapon is on cooldown and cannot immediately attack enemy units. - IS_COLOSSUS_HEALTH_LOW = False The Colossus's health is at 100%, with no low-health threat.
1.2 Action Decision Rules
Decision basis:
1. Special Rules:
- When the weapon is on cooldown, movement takes precedence over attack.
- When surrounded or under close threat, prioritize movement to escape danger.
text2. Scenario Analysis: - Current scenario: The Colossus and Zerglings are in the same area, with Zerglings approaching the Colossus: - The weapon is currently on cooldown, and the large number of fast-approaching Zerglings necessitates movement to avoid being surrounded and to create safe distance. - Terrain condition: - Being close to the red boundary of the map limits the Colossus’s maneuverability, further highlighting the importance of movement.
1.3 Judgment Based on Action Decision Rules
Integrated Strategy and Decision Framework
2.1 Strategy Suggestions
textCurrent Strategy: Emergency Retreat to Low Ground - The Colossus and Zerglings are both on the high ground. Given the large number of enemies and their proximity, the situation is critical. - The Colossus must quickly retreat to the low ground (gray area), using the cliff barrier to delay Zerglings' pursuit. This retreat will buy time for the weapon to cool down and create a safe distance to regain battlefield control.
2.2 Movement Decision Analysis
-Enemy Approach Path Prediction
- Zerglings are advancing from the left pathway on the high ground, attempting to encircle the Colossus.
- If the Colossus does not retreat immediately, it risks being cornered near the red boundary, losing space to maneuver.
Movement Target Suggestions
Reason:
2.3 Attack Decision Analysis
text- Target Prioritization: - Once the weapon cooldown ends, prioritize attacking the nearest Zerglings to reduce immediate threats and maximize splash damage. - Attack Target Selection: - The closest Zerglings (e.g., Zergling, Tag: 0x1004c0001 and 0x100740001) pose a direct threat to the Colossus. - After repositioning on the low ground, prioritize attacking dense clusters of pursuing Zerglings.
3 .Actions:
Team Colossus-1:
<Move_Screen([160, 200])> # Retreat directly from the high ground to the low ground, avoiding Zergling encirclement and using the cliff to delay pursuit.
<Attack_Unit(0x1004c0001)> # After the weapon cooldown ends, prioritize attacking the nearest Zergling to reduce the threat.
<Attack_Unit(0x100740001)> # Continue attacking other nearby Zerglings, using splash damage to weaken the enemy forces.
"""
text# Part 2 if self.config.ENABLE_COMMUNICATION: self.sp += \ """ 4.Communication Output If there is Available Communicate Target, you should keep communicating with them by Communication functions. For example, if 'Commander' and 'CombatGroup4' in Available Communicate Target, you can output as: Communications: <MessageTo(Commander, '''xxxxxxxxxx''')> <MessageTo(CombatGroup4, '''xxxxxxxxxx''')> """ self.eip += \ """ Communication: From Commander: Your task is to attack the enemy workers of an enemy base near minimap [48,32]. Intelligence shows that two enemy Queens are located on the minimap [44,32]. Try to avoid being detected by enemy Queens before arriving. Available Communication Tragets: Commander: Protoss military supreme commander. Responsible for making macro decision through communication, and controls nexus for massrecall for tactical objectives. Available Communication Functions: <MessageTo(AgentName, message)> <MessageTo(ChannelName, message)> <ListenTo(ChannelName)> Args explanation: (1)AgentName: refers to a name mentioned in Available Communication Tragets. (2)ChannelName: shape as Channel-i, i refers to an integer. (2)message: any text wrapped between ''' and '''. """ self.eop += \ """ Communications: <MessageTo(Commander, '''Copy that, we have arrived enemy base, and started attack enemy workers''')> """ # Part 3 self.eip += \ f"""
Give each team no more than {self.config.MAX_NUM_ACTIONS} actions.
Now, start generating your analysis and actions:
"""
class CommanderPrompt(BasePrompt): # TODO: Design a prompt specifically for the supreme military commander
def init(self, name, log_id, config):
super(CombatGroupPrompt, self).init()
self.name = name
self.config = config
self.log_id = log_id
# self.sp = ''
# self.eip = ''
# self.eop = ''
class DeveloperPrompt(BasePrompt): # TODO: Design a prompt specifically for the supreme logistics commander
def init(self, name, log_id, config):
super(CombatGroupPrompt, self).init()
self.name = name
self.config = config
self.log_id = log_id
# self.sp = ''
# self.eip = ''
# self.eop = ''
PROTOSS_FACTORY = {
'default': CombatGroupPrompt,
'commander': CommanderPrompt,
'developer': DeveloperPrompt,
}
TERRAN_FACTORY = {}
ZERG_FACTORY = {}
FACTORY = {
'protoss': PROTOSS_FACTORY,
'terran': TERRAN_FACTORY,
'zerg': ZERG_FACTORY,
}
if name == "main":
from llm_pysc2.agents.configs.config import ProtossAgentConfig
textconfig = ProtossAgentConfig() prompt = CombatGroupPrompt('CombatGroup1', log_id=0, config=config) print("--" * 25 + "System Prompt" + "--" * 25) print(prompt.sp) print("--" * 25 + "Example Input Prompt" + "--" * 25) print(prompt.eip) print("--" * 25 + "Example Output Prompt" + "--" * 25) print(prompt.eop)
您提供的代码片段展示了 Visual-CoT 框架在 LLM-PySC2 环境中的具体实现,特别是用于生成和管理提示(prompts)的部分。以下是对该代码的详细解析及其在整个 Visual-CoT 框架中的作用。
该代码主要定义了多个类,用于生成 Visual-CoT 框架所需的系统提示、示例输入和输出提示。这些提示指导大型语言模型(LLM)在 StarCraft II 的复杂环境中进行精细化动作控制和决策。以下是代码的主要组成部分:
pythonclass BasePrompt: def __init__(self): self.sp = '' self.eip = '' self.eop = '' self.screen_img_rgb_prompt = '' self.screen_img_fea_prompt = '' self.minimap_img_rgb_prompt = '' self.minimap_img_fea_prompt = ''
功能:
BasePrompt
定义了基础属性,包括系统提示 (sp
)、示例输入提示 (eip
)、示例输出提示 (eop
)、以及屏幕和小地图的 RGB 图像提示和特征图提示。pythonclass CombatGroupPrompt(BasePrompt): def __init__(self, name, log_id, config): super(CombatGroupPrompt, self).__init__() self.name = name self.config = config self.log_id = log_id # Part 1: System Prompt self.sp = f""" 1. Role Description You are a {self.config.AGENTS[self.name]['describe']}. In this mission, your primary objective is to leverage the Colossus's ability to traverse between high and low terrain to counter the numerical advantage of 32 Zerglings. Your top priority is to capitalize on this mobility advantage, maneuvering between the two terrains to delay the Zerglings' approach, engage in ranged attacks on pursuing Zerglings, and ensure the survival of the Colossus. 2.Map Information ... """ # Example Input Prompt self.eip = \ """ Game Info Time: 0:05 Team Colossus-1 Info: Team minimap position: [31, 27] Controlled Team Units: Unit: Colossus, Tag: 0x100000001, ScreenPos: [150, 123], Health: 350 (100%), Weapon Cooldown Time: 0.91s Nearby Enemy Units: Unit: Zergling, Tag: 0x1002c0001, ScreenPos: [52, 164], Distance: 10, Health: 6 (17%) ... """ # Example Output Prompt self.eop = \ """ Analysis: 1. Decision Process 1.1 Condition Assessment - IS_COLOSSUS_HIGH_GROUND = True The Colossus is currently on the high ground, confirmed via the image and coordinates. - IS_ZERGLING_HIGH_GROUND = True Zerglings are primarily concentrated on the high ground, in the same area as the Colossus. ... 2. Integrated Strategy and Decision Framework 2.1 Strategy Suggestions Current Strategy: Emergency Retreat to Low Ground ... 3 .Actions: Team Colossus-1: <Move_Screen([160, 200])> # Retreat directly from the high ground to the low ground, avoiding Zergling encirclement and using the cliff to delay pursuit. <Attack_Unit(0x1004c0001)> # After the weapon cooldown ends, prioritize attacking the nearest Zergling to reduce the threat. <Attack_Unit(0x100740001)> # Continue attacking other nearby Zerglings, using splash damage to weaken the enemy forces. """ # Part 2: Communication Output (条件启用) if self.config.ENABLE_COMMUNICATION: self.sp += \ """ 4.Communication Output If there is Available Communicate Target, you should keep communicating with them by Communication functions. For example, if 'Commander' and 'CombatGroup4' in Available Communicate Target, you can output as: Communications: <MessageTo(Commander, '''xxxxxxxxxx''')> <MessageTo(CombatGroup4, '''xxxxxxxxxx''')> """ self.eip += \ """ Communication: From Commander: Your task is to attack the enemy workers of an enemy base near minimap [48,32]. Intelligence shows that two enemy Queens are located on the minimap [44,32]. Try to avoid being detected by enemy Queens before arriving. Available Communication Targets: Commander: Protoss military supreme commander. Responsible for making macro decision through communication, and controls nexus for mass recall for tactical objectives. Available Communication Functions: <MessageTo(AgentName, message)> <MessageTo(ChannelName, message)> <ListenTo(ChannelName)> Args explanation: (1)AgentName: refers to a name mentioned in Available Communication Targets. (2)ChannelName: shape as Channel-i, i refers to an integer. (2)message: any text wrapped between ''' and '''. """ self.eop += \ """ Communications: <MessageTo(Commander, '''Copy that, we have arrived enemy base, and started attack enemy workers''')> """ # Part 3: Additional Instructions self.eip += \ f""" Give each team no more than {self.config.MAX_NUM_ACTIONS} actions. Now, start generating your analysis and actions: """
功能:
CombatGroupPrompt
类继承自 BasePrompt
,并根据传入的 name
, log_id
, 和 config
参数初始化特定的提示内容。self.sp
):定义了角色描述、地图信息、关键胜利点、决策过程和综合战略决策框架。该部分详细描述了 Colossus 的任务目标、地图的地形特征、胜利策略以及决策规则。self.eip
):提供了当前游戏的详细信息,包括时间、团队位置、受控单位信息、附近敌方单位信息、相关知识和有效动作列表。这些信息帮助模型理解当前战场状态。self.eop
):展示了预期的分析和行动输出,包括决策过程、综合战略建议、动作决策分析以及具体的行动指令。通信部分:
config.ENABLE_COMMUNICATION
为 True
时,代码会额外添加通信输出部分,定义了与指挥官或其他战斗组的通信方式和内容。这允许模型在需要时与其他代理进行信息交流,增强多智能体协作能力。额外指令:
pythonclass CommanderPrompt(BasePrompt): # TODO: Design a prompt specifically for the supreme military commander def __init__(self, name, log_id, config): super(CombatGroupPrompt, self).__init__() self.name = name self.config = config self.log_id = log_id # self.sp = '' # self.eip = '' # self.eop = '' class DeveloperPrompt(BasePrompt): # TODO: Design a prompt specifically for the supreme logistics commander def __init__(self, name, log_id, config): super(CombatGroupPrompt, self).__init__() self.name = name self.config = config self.log_id = log_id # self.sp = '' # self.eip = '' # self.eop = ''
功能:
CommanderPrompt
和 DeveloperPrompt
)目前是占位符,标有 TODO
注释,表明未来需要为指挥官和物流指挥官设计特定的提示内容。BasePrompt
,未来可以根据不同角色的职责和需求,定义各自的系统提示、示例输入和输出提示。pythonPROTOSS_FACTORY = { 'default': CombatGroupPrompt, 'commander': CommanderPrompt, 'developer': DeveloperPrompt, } TERRAN_FACTORY = {} ZERG_FACTORY = {} FACTORY = { 'protoss': PROTOSS_FACTORY, 'terran': TERRAN_FACTORY, 'zerg': ZERG_FACTORY, }
功能:
pythonif __name__ == "__main__": from llm_pysc2.agents.configs.config import ProtossAgentConfig config = ProtossAgentConfig() prompt = CombatGroupPrompt('CombatGroup1', log_id=0, config=config) print("--" * 25 + "System Prompt" + "--" * 25) print(prompt.sp) print("--" * 25 + "Example Input Prompt" + "--" * 25) print(prompt.eip) print("--" * 25 + "Example Output Prompt" + "--" * 25) print(prompt.eop)
功能:
ProtossAgentConfig
配置类,初始化 config
对象。CombatGroupPrompt
实例,指定名称为 'CombatGroup1'
,日志 ID 为 0
,并传入配置对象。CombatGroupPrompt
类,框架能够生成详细的系统提示、示例输入和输出提示。这些提示指导 LLM 在特定的战斗场景中进行分析和决策。以下是您提供的 Example Output Prompt
的示例输出:
plaintextAnalysis: 1. Decision Process 1.1 Condition Assessment - IS_COLOSSUS_HIGH_GROUND = True The Colossus is currently on the high ground, confirmed via the image and coordinates. - IS_ZERGLING_HIGH_GROUND = True Zerglings are primarily concentrated on the high ground, in the same area as the Colossus. - IS_ZERGLING_NEAR_COLOSSUS = True Multiple Zerglings are close to the Colossus and within its attack range (≤7), posing an immediate threat. - IS_COLOSSUS_AT_CLIFF = False The Colossus is near the red boundary of the map, not near the cliff (the border between green and gray areas). - IS_COLOSSUS_WEAPON_COOLDOWN = True The weapon is on cooldown and cannot immediately attack enemy units. - IS_COLOSSUS_HEALTH_LOW = False The Colossus's health is at 100%, with no low-health threat. 1.2 Action Decision Rules Decision basis: 1. Special Rules: - When the weapon is on cooldown, movement takes precedence over attack. - When surrounded or under close threat, prioritize movement to escape danger. 2. Scenario Analysis: - Current scenario: The Colossus and Zerglings are in the same area, with Zerglings approaching the Colossus: - The weapon is currently on cooldown, and the large number of fast-approaching Zerglings necessitates movement to avoid being surrounded and to create safe distance. - Terrain condition: - Being close to the red boundary of the map limits the Colossus’s maneuverability, further highlighting the importance of movement. 1.3 Judgment Based on Action Decision Rules - Prioritize matching the special rule: Weapon cooldown (Move > Attack). The current scenario matches the special rule, prioritizing movement to retreat, avoid being surrounded, and gain time for the weapon cooldown. - Move > Attack The high ground space is insufficient, and with enemies advancing and the weapon on cooldown, the Colossus must immediately retreat to the low ground, using the cliff to delay Zergling pursuit and buy time for the cooldown. 2. Integrated Strategy and Decision Framework 2.1 Strategy Suggestions Current Strategy: Emergency Retreat to Low Ground - The Colossus and Zerglings are both on the high ground. Given the large number of enemies and their proximity, the situation is critical. - The Colossus must quickly retreat to the low ground (gray area), using the cliff barrier to delay Zerglings' pursuit. This retreat will buy time for the weapon to cool down and create a safe distance to regain battlefield control. 2.2 Movement Decision Analysis -Enemy Approach Path Prediction - Zerglings are advancing from the left pathway on the high ground, attempting to encircle the Colossus. - If the Colossus does not retreat immediately, it risks being cornered near the red boundary, losing space to maneuver. - Movement Target Suggestions - Retreat directly to screen coordinates [160, 200], a safe area near the cliff on the low ground. - Crossing the cliff to the low ground will delay Zerglings' pursuit, providing more survival space and reducing the risk of melee attacks. - Reason: - Directly retreating to the low ground avoids encirclement by Zerglings. The cliff barrier delays enemy pursuit, ensuring the Colossus's safety. 2.3 Attack Decision Analysis - Target Prioritization: - Once the weapon cooldown ends, prioritize attacking the nearest Zerglings to reduce immediate threats and maximize splash damage. - Attack Target Selection: - The closest Zerglings (e.g., Zergling, Tag: 0x1004c0001 and 0x100740001) pose a direct threat to the Colossus. - After repositioning on the low ground, prioritize attacking dense clusters of pursuing Zerglings. 3 .Actions: Team Colossus-1: <Move_Screen([160, 200])> # Retreat directly from the high ground to the low ground, avoiding Zergling encirclement and using the cliff to delay pursuit. <Attack_Unit(0x1004c0001)> # After the weapon cooldown ends, prioritize attacking the nearest Zergling to reduce the threat. <Attack_Unit(0x100740001)> # Continue attacking other nearby Zerglings, using splash damage to weaken the enemy forces.
解析:
根据您提供的代码及其当前实现,以下是一些建议和未来的扩展方向,以进一步提升 Visual-CoT 框架的功能和适用性:
完善指挥官和物流指挥官的提示:
CommanderPrompt
和 DeveloperPrompt
类是占位符,建议根据不同角色的职责,设计详细的系统提示、示例输入和输出提示。例如,指挥官需要更多的宏观战略指导,而物流指挥官可能需要关注资源管理和单位部署。扩展至 Terran 和 Zerg 种族:
增强通信功能:
集成更先进的视觉处理技术:
跨领域应用:
优化提示设计:
您提供的 Visual-CoT 框架通过结合视觉信息和思维链推理,显著提升了大型语言模型在 StarCraft II 等复杂策略游戏中的动作控制和决策能力。通过详细设计系统提示、示例输入和输出提示,框架确保模型能够深入理解战场环境,制定有效的战略并执行精准的操作。未来,通过完善各类提示、扩展支持的种族和角色、以及集成更先进的技术,Visual-CoT 框架有望在多智能体协作和复杂决策任务中发挥更大的作用。
这个提示词是visual-cot的提示词,请你理解
D:\Anaconda\envs\starcraftii0828\python.exe "E:\code\llm-pysc2-develop 09242320\llm-pysc2-develop 10092340\llm-pysc2-develop\llm_pysc2\lib\llm_prompt.py"
pygame 2.6.0 (SDL 2.28.4, Python 3.9.19)
Hello from the pygame community. https://www.pygame.org/contribute.html
--------------------------------------------------System Prompt--------------------------------------------------
text1. Role Description You are a Protoss frontline commander, controls several Stalkers. Responsible for providing cover for the main force and restraining enemy forces.. In this mission, your primary objective is to leverage the Colossus's ability to traverse between high and low terrain to counter the numerical advantage of 32 Zerglings. Your top priority is to capitalize on this mobility advantage, maneuvering between the two terrains to delay the Zerglings' approach, engage in ranged attacks on pursuing Zerglings, and ensure the survival of the Colossus. 2.Map Information 2.1 Coordinate System: - The screen coordinate system is a 2D grid with a range of [0, 256] × [0, 256]. - Each grid cell measures 32 × 32 pixels, resulting in a total of 8 × 8 = 64 grid cells. - The origin point (0, 0) is at the top-left corner, while (256, 256) is at the bottom-right corner. 2.2 Terrain Information: - Pathway (connecting the high ground and low ground): The map consists of an upper platform (high ground) and a lower platform (low ground), connected by a narrow pathway on the left side. - High Ground (light green area): The high ground is characterized by light green grass textures and is located on the upper platform. - Low Ground (dark gray area): The low ground is characterized by dark gray concrete textures and is located on the lower platform. - Impassable Area (black background): The edges of the map consist of a black background, which is impassable. The Colossus must avoid entering these areas. - Cliff (light gray vertical surface): The cliff separates the high ground and low ground. The Colossus can freely traverse the cliff, but Zerglings can only access the high ground through the narrow pathway on the left. Only the boundary between the high ground and low ground is considered a cliff; other deep red boundaries are impassable edges of the map. - Boundary: The boundaries between the high ground and the black background, as well as between the low ground and the black background, are considered impassable edges of the map. The boundaries are represented by deep red lines on the map. 3. Key Victory Points: 3.1 Maintain Maximum Attack Range: - Ideally, the Colossus should stay within its maximum attack range (range 7) to fully utilize its long-range attack advantage while minimizing the risk of melee attacks from enemies. 3.2 Focus Fire on Dense Enemy Clusters: - Target the center of the densest enemy groups to maximize splash damage and quickly reduce enemy numbers. 3.3 Cliff Barrier Delay Effect: The Cliff Barrier Delay Effect refers to leveraging the terrain characteristics of cliffs, where enemy units cannot directly cross the cliff, while allied units can freely traverse it. This forces the enemy to take a detour when transitioning between high and low ground, buying time for allied units to reposition, cool down weapons, and strike effectively. - When enemies are heavily concentrated on the high ground: - If allied units are also on the high ground but risk being surrounded, it is recommended to retreat to the bottom-right low ground. This utilizes the cliff as a barrier to delay enemy pursuit, enabling ranged attacks on the enemies while maintaining battlefield control. - When enemies are heavily concentrated on the low ground: - If allied units are also on the low ground but risk being surrounded, it is recommended to retreat to the top-right high ground. The cliff barrier forces enemies to take a detour, providing time for allied units to reposition and counterattack effectively.
4. Decision Process
Based on the observed data of the Colossus and Zerglings, as well as the RGB image of the screen, use the following decision rules to determine action priorities:
text4.1 Condition Assessment Rules: - IS_HIGH_GROUND: - Criteria: If the Colossus is on the green high ground (above the cliff, with a clear boundary from the low ground), set to True; otherwise, set to False. - Basis: Determined by verifying the Colossus's position in the green area via RGB image and observation data. - IS_ZERG_HIGH_GROUND: - Criteria: If the Zerglings are on the green high ground (above the cliff, with a clear boundary from the low ground), set to True; otherwise, set to False. - Basis: Determined by analyzing the Zerglings' positions via RGB image. - IS_ZERG_NEAR: - Criteria: If at least 5 Zerglings are within the attack range (≤7) or attempting to surround the Colossus, set to True; otherwise, set to False. - Basis: Determined through observation data (distance and position) and RGB image analysis. - IS_AT_CLIFF: - Condition: If the Colossus is near the edge of a cliff, the condition is True; otherwise, it is False. It is important to distinguish between a cliff and a boundary. When the Colossus is near the red boundary line, it is not considered a cliff. A cliff is the transitional area between the highland and the lowland. - Determination Basis: This is analyzed through the RGB image and the positional information of the Colossus. - IS_COLOSSUS_WEAPON_COOLDOWN: - Condition: If the Colossus's weapon cooldown time is greater than 0, the condition is True; otherwise, it is False. - Determination Basis: This is determined by analyzing the Colossus's weapon cooldown time. - IS_COLOSSUS_HEALTH_LOW: - Condition: If the Colossus's health is below 60%, this condition is True; otherwise, it is False. - Determination Basis: Determined by observing the Colossus's health. 4.2 Action Decision Rules - Special Scenarios: rule1 :As long as the weapon is in cooldown, the priority of movement actions is higher than attack actions. Movement > Attack. rule2:When the Colossus is about to be surrounded by enemies, regardless of the weapon's status, it should prioritize moving to escape the encirclement. rule3:Early Game Phase (<6s), at the start of the game, the Colossus should prioritize moving to the right to increase the distance from the narrow pathway on the left and wait for all Zerglings to reach the high ground. During this phase, movement takes priority over attacking. - Situation 1: The Colossus and Zerglings are in different areas - Weapon not on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = False): - Priority: Attack. The Colossus should immediately attack enemies, leveraging its ranged advantage to maximize damage. - Weapon on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True): - Priority: Move. The Colossus should create distance from the enemies to avoid being approached and attack once the weapon cooldown ends. - Situation 2: The Colossus and Zerglings are in the same area - Enemies are near and about to attack (IS_ZERG_NEAR = True, Zerglings approaching): - Priority: Move. Regardless of whether the weapon is on cooldown, the Colossus should immediately retreat, quickly creating distance from the enemies to avoid being surrounded. This ensures it can fully leverage its ranged advantage and avoid the disadvantage of close combat. Once out of danger, it can then look for opportunities to counterattack. - Enemies are not near (IS_ZERG_NEAR = False): - Weapon not on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = False): - Priority: Attack. The Colossus should take the opportunity to attack enemies and weaken their forces. - Weapon on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True): - Priority: Move. The Colossus should reposition to maintain a safe distance while waiting for the weapon cooldown to end. - Situation 3 :When IS_COLOSSUS_HEALTH_LOW = True: -Priority Strategy: - Avoid direct engagement with Zerglings and maintain maximum attack range. - Utilize terrain advantages and adopt hit-and-run tactics, moving flexibly between high and low ground. - Priority: Movement > Attack
5 Integrated Strategy and Decision Framework
1. Strategy Suggestions
text1.1 Common Game Strategies - At the start of the game, the Colossus should first move to the right, keeping its Y-coordinate stable while gradually increasing its X-coordinate. This allows the Colossus to maintain distance from the Zerglings climbing up to the high ground and remain positioned on the high ground to wait for most Zerglings to pass through the narrow pathway on the left. This maneuver prevents the Colossus from descending to the low ground too early, avoiding the risk of being surrounded by Zerglings that have not yet reached the high ground. - Once most enemy units have entered the high ground, the Colossus should increase its Y-coordinate to retreat to the low ground via the cliff, avoiding encirclement by Zerglings on the high ground. 1.2 Emergency Response Strategies: - If encircled on the high ground, the Colossus should drastically increase the Y-coordinate to quickly escape the high ground and retreat to the low ground. - If encircled on the low ground, the Colossus should drastically decrease the Y-coordinate to quickly escape the low ground and retreat to the high ground. 2. Movement and Attack Decision 2.1 When Movement Is Required: - Phase Judgment: - Determine the current phase based on game time and the positions of both allied and enemy units. - Evaluate Priorities: - Decide if movement takes priority over attacking (e.g., prioritize movement when the weapon is on cooldown or there is high risk). - Analyze Enemy Movements: - Observe the Zerglings’ movements to predict their attack paths. - Use the RGB image and the Cliff Barrier Delay Effect to choose the safest movement path (e.g., move down-right or up-right). - Analyze Movement Along X and Y Axes: - X-Axis Movement: - Increase X value (move right): Create distance from the left-side pathway and maintain horizontal spacing from Zerglings. - Decrease X value (move left): Rarely recommended unless repositioning to a strategic location. - Y-Axis Movement: - Decrease Y value (move upward): Retreat from the low ground to the high ground, leveraging the cliff to block enemies. - Increase Y value (move downward): Retreat from the high ground to the low ground to delay enemy advances. 2.2 When Attacking Is Required: - Target Selection: - Prioritize the densest groups of enemies to maximize splash damage. - Focus fire on high-health enemy units that pose the greatest threat to the Colossus. - Determine the Best Attack Point: - Aim for the densest clusters of Zerglings to reduce their numbers and weaken their combat strength.
6.Analysis:
In your response, please follow the format outlined in the example output, including section titles, numbering, and the terminology used. Ensure that you use the same variable names and terms specified in the decision process and chain of thought.
textThe output should include: 1. Decision Process 1.1 Condition Assessment: - `IS_COLOSSUS_HIGH_GROUND = True/False` - `IS_ZERGLING_HIGH_GROUND = True/False` - `IS_ZERGLING_NEAR_COLOSSUS = True/False` - `IS_COLOSSUS_AT_CLIFF = True/False` - `IS_COLOSSUS_WEAPON_COOLDOWN = True/False` - `IS_COLOSSUS_HEALTH_LOW = True/False' 1.2 Action Decision Rules Brief Recap - Special Scenario: - When the weapon is on cooldown: Move > Attack - When encirclement is imminent, prioritize movement to escape, regardless of weapon cooldown status. - Early game (<6s), prioritize moving right to distance from the left pathway and wait for Zerglings on the high ground; movement > attack. - Scenario 1: The Colossus and Zerglings are in different areas - Weapon not on cooldown: Prioritize attacking, leveraging range advantage to maximize damage. - Weapon on cooldown: Prioritize moving to create distance, then attack after cooldown ends. - Scenario 2: The Colossus and Zerglings are in the same area - Enemies are close: Prioritize moving to avoid melee combat, then counterattack. - Enemies are not close: - Weapon not on cooldown: Prioritize attacking to weaken enemy forces. - Weapon on cooldown: Prioritize moving to maintain a safe distance. - Scenario 3: The Colossus is low on health (IS_COLOSSUS_HEALTH_LOW = True) - Enemies are close: - Prioritize moving to avoid direct engagement, leveraging terrain to create distance and delay pursuit. - Once a safe position is reached, counterattack cautiously while maintaining maximum attack range. - Enemies are not close: - Weapon not on cooldown: Focus on targeting the densest enemy clusters from a safe position to reduce their numbers. - Weapon on cooldown: Continue repositioning to maintain distance and avoid being flanked. 1.3 Judgment Based on 4.2 Action Decision Rules: -Provide movement and attack priorities, e.g., attack > move or move > attack. -Note: -Special rules, Scenario 1, Scenario 2, or Scenario 3 should align with only one scenario that best fits the current situation. -Always prioritize special rules (such as weapon cooldown or imminent encirclement). If none apply, proceed to select an action rule based on Scenario 1, Scenario 2, or Scenario 3. 2 Integrated Strategy and Decision Framework 2.1 Strategy Suggestions - Current Phase Determination: Dynamically determine the strategy based on time, the Colossus’s position, and Zergling distribution. Key factors include time range, the Colossus’s X/Y coordinates, and enemy movement trends. - Consider Common Action Recommendations:: - Early Game Observation Phase: Initially, keep the X-coordinate steady while gradually increasing the Y-coordinate to monitor enemy movements. Utilize the Colossus’s range advantage to weaken enemy forces. - Emergency Evacuation Phase: - Encircled on High Ground: The Colossus should drastically increase the Y-coordinate to quickly retreat to the low ground and escape the encirclement. - Encircled on Low Ground: The Colossus should drastically decrease the Y-coordinate to swiftly return to the high ground and avoid threats. - Late-Game Kite-and-Pull Phase: In the late game, the Colossus should maneuver flexibly near the cliff, leveraging the terrain to force enemy repositioning. Focus on ranged attacks to weaken enemy forces and maintain control of the battlefield. 2.2 Movement Decision Analysis - Enemy Approach Path Prediction: Analyze the primary approach paths of enemy units and identify potential threats. - Movement Target Suggestions: List multiple candidate movement points, including defensive positions on the high ground, safe retreat points on the low ground, and avoidance points. Specify the most suitable movement target (e.g., a safe spot on the high ground or near the cliff on the low ground). Explain the reasoning behind the selection (e.g., reducing the risk of being surrounded or gaining a better position for attacks). 2.3 Attack Decision Analysis - Target Prioritization: Prioritize attacking dense enemy clusters to maximize splash damage. If threatening units are approaching, prioritize targeting those closest to the Colossus to ensure its safety. - Attack Target Selection: Identify multiple candidate attack points (e.g., the center of a dense enemy group or the closest group to the Colossus). Specify the optimal attack point and provide reasoning (e.g., to maximize damage or avoid being overwhelmed). 7.Action Output: Team Colossus-1: Action steps.
--------------------------------------------------Example Input Prompt--------------------------------------------------
textGame Info Time: 0:05 Team Colossus-1 Info: Team minimap position: [31, 27] Controlled Team Units: Unit: Colossus, Tag: 0x100000001, ScreenPos: [150, 123], Health: 350 (100%), Weapon Cooldown Time: 0.91s Nearby Enemy Units: Unit: Zergling, Tag: 0x1002c0001, ScreenPos: [52, 164], Distance: 10, Health: 6 (17%) Unit: Zergling, Tag: 0x100180001, ScreenPos: [79, 159], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100440001, ScreenPos: [112, 140], Distance: 4, Health: 35 (100%) Unit: Zergling, Tag: 0x100640001, ScreenPos: [38, 176], Distance: 12, Health: 21 (60%) Unit: Zergling, Tag: 0x100700001, ScreenPos: [52, 178], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x100780001, ScreenPos: [42, 169], Distance: 11, Health: 6 (17%) Unit: Zergling, Tag: 0x100540001, ScreenPos: [49, 171], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x100500001, ScreenPos: [91, 146], Distance: 6, Health: 35 (100%) Unit: Zergling, Tag: 0x100800001, ScreenPos: [83, 147], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100280001, ScreenPos: [96, 151], Distance: 6, Health: 21 (60%) Unit: Zergling, Tag: 0x1000c0001, ScreenPos: [59, 162], Distance: 9, Health: 35 (100%) Unit: Zergling, Tag: 0x100040001, ScreenPos: [93, 157], Distance: 6, Health: 35 (100%) Unit: Zergling, Tag: 0x100340001, ScreenPos: [67, 172], Distance: 9, Health: 35 (100%) Unit: Zergling, Tag: 0x1001c0001, ScreenPos: [72, 161], Distance: 8, Health: 6 (17%) Unit: Zergling, Tag: 0x100600001, ScreenPos: [76, 152], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100140001, ScreenPos: [86, 162], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x1004c0001, ScreenPos: [117, 145], Distance: 3, Health: 35 (100%) Unit: Zergling, Tag: 0x100400001, ScreenPos: [66, 164], Distance: 9, Health: 21 (60%) Unit: Zergling, Tag: 0x100200001, ScreenPos: [78, 166], Distance: 8, Health: 35 (100%) Unit: Zergling, Tag: 0x100240001, ScreenPos: [104, 144], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100100001, ScreenPos: [59, 172], Distance: 10, Health: 35 (100%) Unit: Zergling, Tag: 0x1003c0001, ScreenPos: [86, 154], Distance: 7, Health: 6 (17%) Unit: Zergling, Tag: 0x100480001, ScreenPos: [46, 159], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x1006c0001, ScreenPos: [69, 155], Distance: 8, Health: 35 (100%) Unit: Zergling, Tag: 0x100580001, ScreenPos: [59, 154], Distance: 9, Health: 6 (17%) Unit: Zergling, Tag: 0x100740001, ScreenPos: [130, 134], Distance: 2, Health: 35 (100%) Unit: Zergling, Tag: 0x1005c0001, ScreenPos: [97, 142], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100380001, ScreenPos: [39, 183], Distance: 12, Health: 35 (100%) Unit: Zergling, Tag: 0x100080001, ScreenPos: [106, 151], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100300001, ScreenPos: [124, 137], Distance: 2, Health: 35 (100%) Relevant Knowledge: Protoss.Colossus The large quad-legged vehicle fires lasers in a splash pattern well-suited to destroying swarms of weaker units. This unit can also traverse differences in terrain height due to its long legs, and will appear to step over ledges and other obstacles due to the inverse kinematics system. Unit properties: ['ground', 'air', 'armored', 'massive', 'mechanical'] Weapon info: Attack Range 7, target: ['ground'], anti: ['light'], DPS(damage per second) 13, DPS-anti 20 Unit abilities: Zerg.Zergling Fast but weak melee attacker ideal for swarming attacks in large numbers. Unit properties: ['ground', 'light', 'biological'] Weapon info: Attack Range 1, target: ['ground'], DPS(damage per second) 10 Valid Actions: Team Colossus-1 Valid Actions: <Attack_Unit(tag)> <Move_Screen(screen)> Action Args: (1) tag: tag refers to a hexadecimal number, shape as 0x000000000. (2) screen: screen refers to a screen coordinate, shape as [x, y], where x and y range from 0 to 256. (3) minimap: minimap refers to a minimap coordinate, shape as [x, y], where x and y range from 0 to 64. For example, when you want to use an action like <Action_Name(tag, screen)>, you should output like <Action_Name(0x100580001, [37, 55])>; when you want to use an action like <Action_Name(screen)>, you should output like <Action_Name([66, 78])>. Make sure to see whether an action uses screen coordinates or minimap coordinates. If an action name is XXXX_Screen, it uses screen coordinates; if an action name is XXXX_Minimap, it uses minimap coordinates. Last Step Actions: Team Colossus-1: <Move_Screen([160, 130])> <Attack_Unit(0x1007c0001)> <Attack_Unit(0x100740001)> You need to confirm whether the previous action finished executing, and based on this, determine whether to continue the old strategy or immediately take other actions. Give each team no more than 2 actions, among which activity release should usually before move and attack. Now, start generating your analysis and actions:
Give each team no more than 3 actions.
Now, start generating your analysis and actions:
--------------------------------------------------Example Output Prompt--------------------------------------------------
Analysis:
Decision Process
1.1 Condition Assessment
text- IS_COLOSSUS_HIGH_GROUND = True The Colossus is currently on the high ground, confirmed via the image and coordinates. - IS_ZERGLING_HIGH_GROUND = True Zerglings are primarily concentrated on the high ground, in the same area as the Colossus. - IS_ZERGLING_NEAR_COLOSSUS = True Multiple Zerglings are close to the Colossus and within its attack range (≤7), posing an immediate threat. - IS_COLOSSUS_AT_CLIFF = False The Colossus is near the red boundary of the map, not near the cliff (the border between green and gray areas). - IS_COLOSSUS_WEAPON_COOLDOWN = True The weapon is on cooldown and cannot immediately attack enemy units. - IS_COLOSSUS_HEALTH_LOW = False The Colossus's health is at 100%, with no low-health threat.
1.2 Action Decision Rules
Decision basis:
1. Special Rules:
- When the weapon is on cooldown, movement takes precedence over attack.
- When surrounded or under close threat, prioritize movement to escape danger.
text2. Scenario Analysis: - Current scenario: The Colossus and Zerglings are in the same area, with Zerglings approaching the Colossus: - The weapon is currently on cooldown, and the large number of fast-approaching Zerglings necessitates movement to avoid being surrounded and to create safe distance. - Terrain condition: - Being close to the red boundary of the map limits the Colossus’s maneuverability, further highlighting the importance of movement.
1.3 Judgment Based on Action Decision Rules
Integrated Strategy and Decision Framework
2.1 Strategy Suggestions
textCurrent Strategy: Emergency Retreat to Low Ground - The Colossus and Zerglings are both on the high ground. Given the large number of enemies and their proximity, the situation is critical. - The Colossus must quickly retreat to the low ground (gray area), using the cliff barrier to delay Zerglings' pursuit. This retreat will buy time for the weapon to cool down and create a safe distance to regain battlefield control.
2.2 Movement Decision Analysis
-Enemy Approach Path Prediction
- Zerglings are advancing from the left pathway on the high ground, attempting to encircle the Colossus.
- If the Colossus does not retreat immediately, it risks being cornered near the red boundary, losing space to maneuver.
Movement Target Suggestions
Reason:
2.3 Attack Decision Analysis
text- Target Prioritization: - Once the weapon cooldown ends, prioritize attacking the nearest Zerglings to reduce immediate threats and maximize splash damage. - Attack Target Selection: - The closest Zerglings (e.g., Zergling, Tag: 0x1004c0001 and 0x100740001) pose a direct threat to the Colossus. - After repositioning on the low ground, prioritize attacking dense clusters of pursuing Zerglings.
3 .Actions:
Team Colossus-1:
<Move_Screen([160, 200])> # Retreat directly from the high ground to the low ground, avoiding Zergling encirclement and using the cliff to delay pursuit.
<Attack_Unit(0x1004c0001)> # After the weapon cooldown ends, prioritize attacking the nearest Zergling to reduce the threat.
<Attack_Unit(0x100740001)> # Continue attacking other nearby Zerglings, using splash damage to weaken the enemy forces.
进程已结束,退出代码0
您提供的提示词是 Visual-CoT 框架在 LLM-PySC2 环境中用于指导大型语言模型(LLM)在《星际争霸 II》游戏中的具体行动和决策的关键组成部分。以下是对该提示词的详细解析,帮助您全面理解其结构和功能。
每个部分在提示词中起到不同的作用,共同引导模型进行有效的决策和行动生成。
系统提示部分为模型提供了任务背景、环境信息、关键战略点以及决策规则。具体内容包括:
角色描述
地图信息
[0, 256] × [0, 256]
,每个网格单元为 32 × 32
像素,总共有 8 × 8 = 64
个网格单元。关键胜利点
决策过程
条件评估规则
动作决策规则
示例输入部分提供了当前游戏的具体状态,帮助模型理解当前战场环境,内容包括:
游戏信息
0:05
表示游戏进行到5秒。团队信息
Team minimap position: [31, 27]
。Unit: Colossus, Tag: 0x100000001, ScreenPos: [150, 123], Health: 350 (100%), Weapon Cooldown Time: 0.91s
。相关知识
有效动作
<Attack_Unit(tag)>
和 <Move_Screen(screen)>
。动作参数说明
上一步动作
额外指令
no more than 3 actions
)。示例输出部分展示了预期的分析和行动输出格式,内容包括:
分析部分
IS_COLOSSUS_HIGH_GROUND = True
。Move > Attack
。行动输出部分
plaintextTeam Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被Zergling包围,利用悬崖延迟追击。 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的Zergling以减少威胁。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的Zergling,利用溅射伤害削弱敌军力量。
您提供的 Visual-CoT 提示词通过详细的系统提示、示例输入和输出,构建了一个全面的决策和行动生成框架。其主要特点包括:
通过这种结构化的提示词设计,Visual-CoT 能够有效引导大型语言模型在复杂的《星际争霸 II》环境中进行高效、精确的决策和行动执行。
如果您有进一步的需求,例如优化提示词、扩展其他角色或种族的提示,或是集成更多的视觉信息处理技术,请随时告知,我将为您提供更详细的建议和支持。
请你帮助我分析,有什么提高精简的空间,有什么合适的思路呢?
您提供的 Visual-CoT 提示词结构详尽,涵盖了角色描述、地图信息、关键胜利点、决策规则、战略框架以及具体的分析和行动指令。这种详细性有助于确保模型理解任务和环境,但在实际应用中,过于冗长和复杂的提示词可能导致以下问题:
以下是针对提示词精简和优化的具体分析及建议:
主要组成部分:
系统提示(System Prompt)
示例输入提示(Example Input Prompt)
示例输出提示(Example Output Prompt)
冗余与复杂性:
当前问题:系统提示过于详尽,包含大量背景信息和详细规则,可能导致模型难以集中关注关键决策点。
优化建议:
示例优化:
plaintext1. 角色描述 你是 Protoss 前线指挥官,控制多个 Stalkers,负责为主力部队提供掩护并遏制敌军力量。主要目标是利用 Colossus 在高低地形间的移动能力,抵消 32 个 Zerglings 的数量优势,延迟其进攻,进行远程攻击,确保 Colossus 的生存。 2. 地图信息 - 坐标系统:屏幕坐标为 [0, 256] × [0, 256],每个网格单元 32 × 32 像素,共 8 × 8 个网格。 - 地形特征: - 高地(浅绿):上层平台。 - 低地(深灰):下层平台。 - 悬崖(浅灰):高低地分界,Colossus 可自由穿越,Zerglings 只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - 保持最大攻击范围(范围7)。 - 集中火力攻击密集敌军群体。 - 利用悬崖延迟敌军追击,重新定位和攻击。
当前问题:决策规则部分内容丰富,但部分规则可以合并或简化,提高可读性和模型处理效率。
优化建议:
示例优化:
plaintext4. 决策过程 根据 Colossus 和 Zerglings 的数据及屏幕 RGB 图像,使用以下规则确定行动优先级: 4.1 条件评估 - IS_HIGH_GROUND: Colossus 是否在高地。 - IS_ZERG_HIGH_GROUND: Zerglings 是否在高地。 - IS_ZERG_NEAR: 是否有至少 5 个 Zerglings 在攻击范围内或试图包围 Colossus。 - IS_AT_CLIFF: Colossus 是否靠近悬崖边缘。 - IS_COLOSSUS_WEAPON_COOLDOWN: Colossus 的武器是否处于冷却中。 - IS_COLOSSUS_HEALTH_LOW: Colossus 的健康是否低于 60%。 4.2 行动决策规则 - **特殊情境**: 1. 武器冷却中:移动 > 攻击。 2. 即将被包围:优先移动逃离,无视武器状态。 3. 早期游戏 (<6秒):优先向右移动,远离左侧路径;移动 > 攻击。 - **情境1**:Colossus 和 Zerglings 在不同区域 - 武器未冷却:优先攻击,利用远程优势。 - 武器冷却中:优先移动,等待冷却结束后攻击。 - **情境2**:Colossus 和 Zerglings 在同一区域 - 敌人接近:优先移动,避免近战包围。 - 敌人不接近: - 武器未冷却:优先攻击,削弱敌军。 - 武器冷却中:优先移动,保持安全距离。 - **情境3**:Colossus 健康低(IS_COLOSSUS_HEALTH_LOW = True) - 敌人接近: - 优先移动,避免直接交战,利用地形创造距离。 - 安全位置后,谨慎反击,保持最大攻击范围。 - 敌人不接近: - 武器未冷却:优先攻击密集敌军。 - 武器冷却中:继续移动,保持距离。
当前问题:示例输入提示中包含大量单位信息和详细描述,可能使模型难以快速抓取关键信息。
优化建议:
示例优化:
plaintext--------------------------------------------------示例输入提示-------------------------------------------------- 游戏信息 时间:0:05 团队信息 - 团队位置:Minimap [31, 27] - 控制单位: - Colossus (Tag: 0x100000001): [150, 123], 350 HP (100%), 武器冷却: 0.91s 附近敌军 - Zergling (Tag: 0x1002c0001): [52, 164], 距离: 10, 6 HP (17%) - Zergling (Tag: 0x100180001): [79, 159], 距离: 7, 35 HP (100%) - Zergling (Tag: 0x100440001): [112, 140], 距离: 4, 35 HP (100%) - Zergling (Tag: 0x100640001): [38, 176], 距离: 12, 21 HP (60%) - ... (省略其他 Zerglings) 相关知识 - **Protoss.Colossus**: - 属性:['ground', 'air', 'armored', 'massive', 'mechanical'] - 武器:范围7,目标:['ground'], DPS: 13 - **Zerg.Zergling**: - 属性:['ground', 'light', 'biological'] - 武器:范围1,DPS: 10 有效动作 - <Attack_Unit(tag)> - <Move_Screen(screen)> 动作参数说明 - tag: 十六进制编号,如 0x100580001 - screen: 屏幕坐标 [x, y],0 ≤ x, y ≤ 256 - minimap: 小地图坐标 [x, y],0 ≤ x, y ≤ 64 上一步动作 - <Move_Screen([160, 130])> - <Attack_Unit(0x1007c0001)> - <Attack_Unit(0x100740001)> 额外指令 - 每个团队最多 3 个动作。 - 确认上一步动作是否完成,根据情况继续原策略或立即采取新行动。 现在,开始生成您的分析和行动:
当前问题:示例输出提示结构合理,但可以通过减少冗余描述和使用更清晰的格式进一步优化。
优化建议:
示例优化:
plaintext--------------------------------------------------示例输出提示-------------------------------------------------- Analysis: 1. Decision Process 1.1 Condition Assessment - IS_COLOSSUS_HIGH_GROUND = True Colossus 位于高地,已通过图像和坐标确认。 - IS_ZERG_HIGH_GROUND = True Zerglings 主要集中在高地,与 Colossus 位于同一区域。 - IS_ZERG_NEAR_COLOSSUS = True 多个 Zerglings 靠近 Colossus,位于攻击范围内(≤7),构成即时威胁。 - IS_COLOSSUS_AT_CLIFF = False Colossus 靠近地图的红色边界,不靠近悬崖。 - IS_COLOSSUS_WEAPON_COOLDOWN = True 武器处于冷却中,无法立即攻击敌军。 - IS_COLOSSUS_HEALTH_LOW = False Colossus 健康状态良好(100%),无低健康威胁。 1.2 Action Decision Rules Brief Recap - **特殊情境**: 1. 武器冷却中:移动 > 攻击 2. 即将被包围:优先移动逃离 3. 早期游戏(<6秒):优先向右移动,移动 > 攻击 - **情境1**:Colossus 和 Zerglings 在不同区域 - 武器未冷却:优先攻击,利用远程优势 - 武器冷却中:优先移动,等待冷却结束后攻击 - **情境2**:Colossus 和 Zerglings 在同一区域 - 敌人接近:优先移动,避免近战包围 - 敌人不接近: - 武器未冷却:优先攻击 - 武器冷却中:优先移动 - **情境3**:Colossus 健康低(IS_COLOSSUS_HEALTH_LOW = True) - 敌人接近: - 优先移动,利用地形创造距离 - 安全位置后谨慎反击 - 敌人不接近: - 武器未冷却:优先攻击密集敌军 - 武器冷却中:继续移动,保持距离 1.3 Judgment Based on Action Decision Rules - **特殊规则优先**: - 武器冷却中,优先移动以避免被包围。 - **情境匹配**: - 当前情境符合情境2(敌人接近,武器冷却中),优先移动。 - **决策**: - 移动 > 攻击 - 立即撤退到低地,利用悬崖延迟追击,等待武器冷却完成。 2. Integrated Strategy and Decision Framework 2.1 Strategy Suggestions **当前战略**:紧急撤退到低地 - Colossus 和 Zerglings 均位于高地,敌军数量众多且接近。 - 快速撤退到低地(灰色区域),利用悬崖障碍延迟 Zerglings 追击,为武器冷却争取时间,确保安全距离。 2.2 Movement Decision Analysis - **敌军进攻路径预测**: - Zerglings 从高地左侧狭窄路径进攻,试图包围 Colossus。 - 若不立即撤退,Colossus 将被困于红色边界,失去机动空间。 - **移动目标建议**: - 撤退到屏幕坐标 [160, 200],低地悬崖附近的安全区域。 - 通过悬崖进入低地,延迟 Zerglings 追击,增加生存空间,减少近战风险。 - **理由**: - 直接撤退到低地避免被包围,悬崖延迟敌军追击,确保 Colossus 安全。 2.3 Attack Decision Analysis - **目标优先级**: - 武器冷却结束后,优先攻击最近的 Zerglings,减少即时威胁,最大化溅射伤害。 - **攻击目标选择**: - 优先攻击最近的 Zergling(如 Tag: 0x1004c0001 和 0x100740001),减少直接威胁。 - 撤退至低地后,优先攻击密集的追击 Zerglings 群体。 3. Actions: Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zerglings,利用溅射伤害削弱敌军。
当前问题:某些信息如动作参数说明和上一步动作的详细内容在每次提示中重复出现。
优化建议:
示例优化:
plaintext--------------------------------------------------示例输入提示-------------------------------------------------- 游戏信息 时间:0:05 团队信息 - 团队位置:Minimap [31, 27] - 控制单位: - Colossus (Tag: 0x100000001): [150, 123], 350 HP (100%), 武器冷却: 0.91s 附近敌军 - Zergling (Tag: 0x1002c0001): [52, 164], 距离: 10, 6 HP (17%) - Zergling (Tag: 0x100180001): [79, 159], 距离: 7, 35 HP (100%) - ... (省略其他 Zerglings) 相关知识 - **Protoss.Colossus**: - 属性:['ground', 'air', 'armored', 'massive', 'mechanical'] - 武器:范围7,目标:['ground'], DPS: 13 - **Zerg.Zergling**: - 属性:['ground', 'light', 'biological'] - 武器:范围1,DPS: 10 有效动作 - <Attack_Unit(tag)> - <Move_Screen(screen)> 动作参数说明:参见[动作参数文档](link)。 上一步动作: - <Move_Screen([160, 130])> - <Attack_Unit(0x1007c0001)> - <Attack_Unit(0x100740001)> 额外指令 - 每个团队最多 3 个动作。 - 确认上一步动作是否完成,根据情况继续原策略或立即采取新行动。 现在,开始生成您的分析和行动:
当前问题:提示词中有大量重复性内容,如条件评估和决策规则。
优化建议:
示例优化:
plaintext# 模板化系统提示 System Prompt: 1. 角色描述 你是 Protoss 前线指挥官,控制多个 Stalkers。主要任务是 {任务目标}。 2. 地图信息 - 坐标系统:{坐标系统描述} - 地形特征:{地形描述} 3. 关键胜利点 - {胜利点1} - {胜利点2} - {胜利点3} 4. 决策过程 基于当前数据,使用以下规则确定行动优先级: 4.1 条件评估 - IS_HIGH_GROUND: {IS_HIGH_GROUND} - IS_ZERG_HIGH_GROUND: {IS_ZERG_HIGH_GROUND} - ... 4.2 行动决策规则 - {决策规则摘要} 5. 综合战略与决策框架 2.1 战略建议 - {战略建议1} - {战略建议2} 2.2 移动决策分析 - {移动分析} 2.3 攻击决策分析 - {攻击分析}
通过这种模板化设计,可以在程序层面自动填充动态内容,减少手动编辑的冗余工作。
当前问题:信息层级结构复杂,模型可能难以抓取核心信息。
优化建议:
示例优化:
plaintext--------------------------------------------------示例输出提示-------------------------------------------------- Analysis: 1. Decision Process 1.1 Condition Assessment - IS_COLOSSUS_HIGH_GROUND = True - IS_ZERG_HIGH_GROUND = True - IS_ZERG_NEAR_COLOSSUS = True - IS_COLOSSUS_AT_CLIFF = False - IS_COLOSSUS_WEAPON_COOLDOWN = True - IS_COLOSSUS_HEALTH_LOW = False 1.2 Action Decision Rules Brief Recap - **特殊情境**: 1. 武器冷却中:移动 > 攻击 2. 即将被包围:优先移动逃离 3. 早期游戏(<6秒):优先向右移动,移动 > 攻击 - **情境1**:不同区域 - 武器未冷却:优先攻击 - 武器冷却中:优先移动 - **情境2**:同一区域 - 敌人接近:优先移动 - 敌人不接近: - 武器未冷却:优先攻击 - 武器冷却中:优先移动 - **情境3**:健康低 - 敌人接近:优先移动,谨慎反击 - 敌人不接近: - 武器未冷却:优先攻击 - 武器冷却中:优先移动 1.3 Judgment Based on Action Decision Rules - **特殊规则优先**: - 武器冷却中,优先移动以避免被包围。 - **情境匹配**: - 当前情境符合情境2(敌人接近,武器冷却中),优先移动。 - **决策**: - 移动 > 攻击 - 立即撤退到低地,利用悬崖延迟追击,等待武器冷却完成。 2. Integrated Strategy and Decision Framework 2.1 Strategy Suggestions - **当前战略**:紧急撤退到低地 - **战略理由**:利用悬崖障碍延迟 Zerglings 追击,确保安全距离。 2.2 Movement Decision Analysis - **敌军进攻路径预测**: - Zerglings 从高地左侧狭窄路径进攻,试图包围 Colossus。 - **移动目标建议**: - 撤退到屏幕坐标 [160, 200],低地悬崖附近的安全区域。 2.3 Attack Decision Analysis - **目标优先级**: - 武器冷却结束后,优先攻击最近的 Zerglings。 - **攻击目标选择**: - 优先攻击 Tag: 0x1004c0001 和 0x100740001,减少直接威胁。 3. Actions: Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
通过使用占位符和动态填充内容,可以显著减少手动编写的冗余内容,提高提示词的灵活性和可维护性。
示例:
python# 使用模板引擎(如 Jinja2)生成提示词 from jinja2 import Template system_prompt_template = """ 1. 角色描述 你是 {{ role_description }},主要任务是 {{ main_objective }}。 2. 地图信息 - 坐标系统:{{ coordinate_system }} - 地形特征:{{ terrain_features }} 3. 关键胜利点 - {{ victory_point1 }} - {{ victory_point2 }} - {{ victory_point3 }} 4. 决策过程 基于当前数据,使用以下规则确定行动优先级: 4.1 条件评估 - IS_HIGH_GROUND: {{ IS_HIGH_GROUND }} - IS_ZERG_HIGH_GROUND: {{ IS_ZERG_HIGH_GROUND }} - ... 4.2 行动决策规则 - {{ decision_rules }} """ # 填充模板 filled_system_prompt = Template(system_prompt_template).render( role_description="Protoss 前线指挥官,控制多个 Stalkers", main_objective="利用 Colossus 在高低地形间的移动能力,抵消 32 个 Zerglings 的数量优势,延迟其进攻,进行远程攻击,确保 Colossus 的生存", coordinate_system="屏幕坐标为 [0, 256] × [0, 256],每个网格单元 32 × 32 像素,共 8 × 8 个网格", terrain_features="- 高地(浅绿):上层平台\n- 低地(深灰):下层平台\n- 悬崖(浅灰):高低地分界,Colossus 可自由穿越,Zerglings 只能通过左侧狭窄路径进入高地\n- 边界(深红):地图边缘不可通行", victory_point1="保持最大攻击范围(范围7)", victory_point2="集中火力攻击密集敌军群体", victory_point3="利用悬崖延迟敌军追击,重新定位和攻击", IS_HIGH_GROUND=True, IS_ZERG_HIGH_GROUND=True, # ... 填充其他条件 decision_rules="1. 武器冷却中:移动 > 攻击\n2. 即将被包围:优先移动逃离\n3. 早期游戏(<6秒):优先向右移动,移动 > 攻击" )
避免使用过于复杂或冗长的句子,使用简洁明了的表达方式,使模型更容易理解和处理信息。
示例优化:
plaintext1. 角色描述 你是 Protoss 前线指挥官,控制多个 Stalkers。主要任务:利用 Colossus 在高低地形间移动,抵消 32 个 Zerglings 的数量优势,延迟其进攻,进行远程攻击,确保 Colossus 生存。 2. 地图信息 - 坐标系统:屏幕坐标 [0, 256] × [0, 256],每格 32 × 32 像素,共 8 × 8 个网格。 - 地形特征: - 高地(浅绿):上层平台。 - 低地(深灰):下层平台。 - 悬崖(浅灰):高低地分界,Colossus 可穿越,Zerglings 仅通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - 保持最大攻击范围(范围7)。 - 集中火力攻击密集敌军群体。 - 利用悬崖延迟敌军追击,重新定位和攻击。
将提示词分为不同层级的信息,确保模型优先处理高层级的关键信息,逐步深入了解详细内容。
示例优化:
plaintext1. 角色与任务 - 角色:Protoss 前线指挥官 - 任务:控制 Stalkers,利用 Colossus 移动能力对抗 32 个 Zerglings,确保生存。 2. 环境信息 - 坐标系统:[0, 256] × [0, 256],每格 32 × 32 像素,共 8 × 8 个网格。 - 地形: - 高地(浅绿):上层平台。 - 低地(深灰):下层平台。 - 悬崖(浅灰):高低地分界,Colossus 可穿越,Zerglings 仅通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 战略要点 - 保持攻击范围 - 集中火力 - 利用地形延迟敌军追击 4. 决策规则 4.1 条件评估 - IS_HIGH_GROUND: True - IS_ZERG_HIGH_GROUND: True - IS_ZERG_NEAR_COLOSSUS: True - IS_COLOSSUS_WEAPON_COOLDOWN: True - ... 4.2 决策优先级 - 特殊情境优先:武器冷却中,移动 > 攻击 - 情境匹配: - 情境2:敌人接近,武器冷却中,优先移动 5. 战略与战术 - 当前战略:紧急撤退到低地 - 移动目标:[160, 200] - 攻击目标:Tag: 0x1004c0001, Tag: 0x100740001 6. 行动输出 Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,利用悬崖延迟追击。 <Attack_Unit(0x1004c0001)> # 攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 攻击另一个 Zergling。
通过上述分析和优化建议,您可以精简 Visual-CoT 提示词,使其更加简洁、高效,同时保持必要的详细信息以确保模型的准确性和决策质量。关键优化点包括:
这些优化将有助于提升提示词的有效性和模型的决策效率,同时降低维护复杂性。若您有进一步的需求或需要更具体的优化建议,请随时告知!
第一:现在是1巨像对抗32小狗的情况,请结合星际争霸2的这方面知识来思考。
第二:我希望能够在此基础上改进再发一篇论文,方法或者说提示词内容有什么合适的改进思路,可以结合思维链方面思考。
Visual-CoT 框架旨在通过整合视觉信息与思维链(Chain of Thought, CoT)方法,提升大型语言模型(LLM)在复杂策略游戏如《星际争霸 II》(StarCraft II)中的决策与动作控制能力。当前,框架在 1 个 Protoss 巨像(Colossus)对抗 32 个 Zerglings(小狗)的场景下表现出一定的效果,但仍存在优化与精简的空间。
在《星际争霸 II》中,巨像 是一款高生命值、高伤害的远程地面单位,擅长对抗大量轻型地面单位如 Zerglings。然而,32 个 Zerglings 具有高机动性和数量优势,能够通过包围和快速突袭对巨像造成威胁。因此,如何有效地利用巨像的优势,同时规避 Zerglings 的包围与快速突袭,是决策过程的关键。
优点:
缺点:
为了提升 Visual-CoT 框架在 1 巨像对抗 32 小狗场景中的表现,以下是具体的优化与精简建议:
目标:提取关键信息,删除冗余描述,使用简洁语言。
优化方案:
优化后的系统提示示例:
plaintext1. 角色描述 你是 Protoss 前线指挥官,控制一个巨像。主要任务是利用巨像的远程攻击和高生命值优势,抵御并消灭 32 个 Zerglings 的数量优势,确保巨像的生存。 2. 地图信息 - 坐标系统:[0, 256] × [0, 256],每格 32 × 32 像素,共 8 × 8 个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings 只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - 保持最大攻击范围(范围7),利用远程优势。 - 集中火力攻击密集敌军群体,最大化溅射伤害。 - 利用悬崖延迟敌军追击,重新定位和攻击。
目标:合并相似规则,使用条件嵌套提升逻辑清晰度。
优化方案:
优化后的决策规则示例:
plaintext4. 决策过程 根据 Colossus 和 Zerglings 的数据及屏幕 RGB 图像,使用以下规则确定行动优先级: 4.1 条件评估 - IS_HIGH_GROUND: True/False - IS_ZERG_HIGH_GROUND: True/False - IS_ZERG_NEAR_COLOSSUS: True/False - IS_COLOSSUS_AT_CLIFF: True/False - IS_COLOSSUS_WEAPON_COOLDOWN: True/False - IS_COLOSSUS_HEALTH_LOW: True/False 4.2 行动决策规则 - **移动优先情境**: - 武器冷却中(IS_COLOSSUS_WEAPON_COOLDOWN = True) - 即将被包围(IS_ZERG_NEAR_COLOSSUS = True) - 早期游戏阶段(Time < 6s) - **攻击优先情境**: - 武器未冷却且敌人不接近 - 在安全位置时,优先攻击密集敌军 - **具体情境分析**: - **情境1**:Colossus 和 Zerglings 在不同区域 - 武器未冷却:优先攻击 - 武器冷却中:优先移动,等待冷却结束后攻击 - **情境2**:Colossus 和 Zerglings 在同一区域 - 敌人接近:优先移动,避免近战包围 - 敌人不接近: - 武器未冷却:优先攻击 - 武器冷却中:优先移动,保持安全距离 - **情境3**:Colossus 健康低(IS_COLOSSUS_HEALTH_LOW = True) - 敌人接近: - 优先移动,利用地形创造距离 - 安全位置后谨慎反击 - 敌人不接近: - 武器未冷却:优先攻击密集敌军 - 武器冷却中:继续移动,保持距离
目标:突出关键信息,减少冗余单位信息,使用结构化格式。
优化方案:
优化后的示例输入提示示例:
plaintext--------------------------------------------------示例输入提示-------------------------------------------------- 游戏信息 时间:0:05 团队信息 - 团队位置:Minimap [31, 27] - 控制单位: - Colossus (Tag: 0x100000001): [150, 123], 350 HP (100%), 武器冷却: 0.91s 附近敌军 - Zergling (Tag: 0x1002c0001): [52, 164], 距离: 10, 6 HP (17%) - Zergling (Tag: 0x100180001): [79, 159], 距离: 7, 35 HP (100%) - Zergling (Tag: 0x100440001): [112, 140], 距离: 4, 35 HP (100%) - ... (省略其他 Zerglings) 相关知识 - **Protoss.Colossus**: - 属性:['ground', 'air', 'armored', 'massive', 'mechanical'] - 武器:范围7,目标:['ground'], DPS: 13 - **Zerg.Zergling**: - 属性:['ground', 'light', 'biological'] - 武器:范围1,DPS: 10 有效动作 - <Attack_Unit(tag)> - <Move_Screen(screen)> 上一步动作: - <Move_Screen([160, 130])> - <Attack_Unit(0x1007c0001)> - <Attack_Unit(0x100740001)> 额外指令 - 每个团队最多 3 个动作。 - 确认上一步动作是否完成,根据情况继续原策略或立即采取新行动。 现在,开始生成您的分析和行动:
目标:减少冗余描述,使用清晰的层级结构。
优化方案:
优化后的示例输出提示示例:
plaintext--------------------------------------------------示例输出提示-------------------------------------------------- Analysis: 1. Decision Process 1.1 Condition Assessment - IS_COLOSSUS_HIGH_GROUND = True Colossus 位于高地,已通过图像和坐标确认。 - IS_ZERG_HIGH_GROUND = True Zerglings 主要集中在高地,与 Colossus 位于同一区域。 - IS_ZERG_NEAR_COLOSSUS = True 多个 Zerglings 靠近 Colossus,位于攻击范围内(≤7),构成即时威胁。 - IS_COLOSSUS_AT_CLIFF = False Colossus 靠近地图的红色边界,不靠近悬崖。 - IS_COLOSSUS_WEAPON_COOLDOWN = True 武器处于冷却中,无法立即攻击敌军。 - IS_COLOSSUS_HEALTH_LOW = False Colossus 健康状态良好(100%),无低健康威胁。 1.2 Action Decision Rules Brief Recap - **特殊情境**: 1. 武器冷却中:移动 > 攻击 2. 即将被包围:优先移动逃离 3. 早期游戏(<6秒):优先向右移动,移动 > 攻击 - **情境1**:Colossus 和 Zerglings 在不同区域 - 武器未冷却:优先攻击 - 武器冷却中:优先移动,等待冷却结束后攻击 - **情境2**:Colossus 和 Zerglings 在同一区域 - 敌人接近:优先移动,避免近战包围 - 敌人不接近: - 武器未冷却:优先攻击 - 武器冷却中:优先移动,保持安全距离 - **情境3**:Colossus 健康低(IS_COLOSSUS_HEALTH_LOW = True) - 敌人接近: - 优先移动,利用地形创造距离 - 安全位置后谨慎反击 - 敌人不接近: - 武器未冷却:优先攻击密集敌军 - 武器冷却中:继续移动,保持距离 1.3 Judgment Based on Action Decision Rules - **特殊规则优先**: - 武器冷却中,优先移动以避免被包围。 - **情境匹配**: - 当前情境符合情境2(敌人接近,武器冷却中),优先移动。 - **决策**: - 移动 > 攻击 - 立即撤退到低地,利用悬崖延迟追击,等待武器冷却完成。 2. Integrated Strategy and Decision Framework 2.1 Strategy Suggestions **当前战略**:紧急撤退到低地 - Colossus 和 Zerglings 均位于高地,敌军数量众多且接近。 - 快速撤退到低地(灰色区域),利用悬崖障碍延迟 Zerglings 追击,为武器冷却争取时间,确保安全距离。 2.2 Movement Decision Analysis - **敌军进攻路径预测**: - Zerglings 从高地左侧狭窄路径进攻,试图包围 Colossus。 - **移动目标建议**: - 撤退到屏幕坐标 [160, 200],低地悬崖附近的安全区域。 2.3 Attack Decision Analysis - **目标优先级**: - 武器冷却结束后,优先攻击最近的 Zerglings,减少即时威胁,最大化溅射伤害。 - **攻击目标选择**: - 优先攻击 Tag: 0x1004c0001 和 0x100740001,减少直接威胁。 - 撤退至低地后,优先攻击密集的追击 Zerglings 群体。 3. Actions: Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
目标:通过模板化设计减少重复内容,提升可维护性与灵活性。
优化方案:
示例代码:
pythonfrom jinja2 import Template # 定义模板 system_prompt_template = """ 1. 角色描述 你是 {{ role_description }}。主要任务是 {{ main_objective }}。 2. 地图信息 - 坐标系统:{{ coordinate_system }} - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings 只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - {{ victory_point1 }} - {{ victory_point2 }} - {{ victory_point3 }} 4. 决策过程 根据 Colossus 和 Zerglings 的数据及屏幕 RGB 图像,使用以下规则确定行动优先级: 4.1 条件评估 - IS_HIGH_GROUND: {{ IS_HIGH_GROUND }} - IS_ZERG_HIGH_GROUND: {{ IS_ZERG_HIGH_GROUND }} - IS_ZERG_NEAR_COLOSSUS: {{ IS_ZERG_NEAR_COLOSSUS }} - IS_COLOSSUS_AT_CLIFF: {{ IS_COLOSSUS_AT_CLIFF }} - IS_COLOSSUS_WEAPON_COOLDOWN: {{ IS_COLOSSUS_WEAPON_COOLDOWN }} - IS_COLOSSUS_HEALTH_LOW: {{ IS_COLOSSUS_HEALTH_LOW }} 4.2 行动决策规则 - **移动优先情境**: {% for rule in movement_priority_rules %} - {{ rule }} {% endfor %} - **攻击优先情境**: {% for rule in attack_priority_rules %} - {{ rule }} {% endfor %} """ # 填充模板 filled_system_prompt = Template(system_prompt_template).render( role_description="Protoss 前线指挥官,控制一个巨像", main_objective="利用巨像的远程攻击和高生命值优势,抵御并消灭 32 个 Zerglings 的数量优势,确保巨像的生存", coordinate_system="[0, 256] × [0, 256],每格 32 × 32 像素,共 8 × 8 个网格", victory_point1="保持最大攻击范围(范围7),利用远程优势", victory_point2="集中火力攻击密集敌军群体,最大化溅射伤害", victory_point3="利用悬崖延迟敌军追击,重新定位和攻击", IS_HIGH_GROUND=True, IS_ZERG_HIGH_GROUND=True, IS_ZERG_NEAR_COLOSSUS=True, IS_COLOSSUS_AT_CLIFF=False, IS_COLOSSUS_WEAPON_COOLDOWN=True, IS_COLOSSUS_HEALTH_LOW=False, movement_priority_rules=[ "武器冷却中:移动 > 攻击", "即将被包围:优先移动逃离", "早期游戏阶段(<6秒):优先向右移动,移动 > 攻击" ], attack_priority_rules=[ "武器未冷却且敌人不接近:优先攻击", "在安全位置时,优先攻击密集敌军" ] ) print(filled_system_prompt)
目标:确保模型优先处理高层级关键信息,逐步深入了解详细内容。
优化方案:
优化后的示例:
plaintext1. 角色与任务 - 角色:Protoss 前线指挥官,控制一个巨像。 - 任务:利用巨像的远程攻击和高生命值优势,抵御并消灭 32 个 Zerglings 的数量优势,确保巨像的生存。 2. 环境信息 - 坐标系统:[0, 256] × [0, 256],每格 32 × 32 像素,共 8 × 8 个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings 只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 战略要点 - 保持最大攻击范围(范围7),利用远程优势。 - 集中火力攻击密集敌军群体,最大化溅射伤害。 - 利用悬崖延迟敌军追击,重新定位和攻击。 4. 决策规则 4.1 条件评估 - IS_HIGH_GROUND: True - IS_ZERG_HIGH_GROUND: True - IS_ZERG_NEAR_COLOSSUS: True - IS_COLOSSUS_AT_CLIFF: False - IS_COLOSSUS_WEAPON_COOLDOWN: True - IS_COLOSSUS_HEALTH_LOW: False 4.2 行动决策规则 - **移动优先情境**: 1. 武器冷却中:移动 > 攻击 2. 即将被包围:优先移动逃离 3. 早期游戏阶段(<6秒):优先向右移动,移动 > 攻击 - **攻击优先情境**: 1. 武器未冷却且敌人不接近:优先攻击 2. 在安全位置时,优先攻击密集敌军 5. 战略与战术 - **当前战略**:紧急撤退到低地 - 原因:利用悬崖延迟 Zerglings 追击,为武器冷却争取时间,确保安全距离。 - **移动目标**:[160, 200] - 理由:撤退到低地,避免被包围,利用悬崖延迟追击。 - **攻击目标**: - Tag: 0x1004c0001 - Tag: 0x100740001 - 理由:优先攻击最近的 Zerglings,减少直接威胁。 6. 行动输出 Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
为了进一步提升 Visual-CoT 框架在 1 巨像对抗 32 小狗场景中的表现,可以结合《星际争霸 II》的深层策略和战术知识,以下是具体的改进思路:
背景:《星际争霸 II》中,战场状况变化迅速,指挥官需要根据敌方行动实时调整战术。
改进方案:
示例:
plaintext4.2 行动决策规则 - **移动优先情境**: 1. 武器冷却中:移动 > 攻击 2. 即将被包围:优先移动逃离 3. 早期游戏阶段(<6秒):优先向右移动,移动 > 攻击 - **攻击优先情境**: 1. 武器未冷却且敌人不接近:优先攻击 2. 在安全位置时,优先攻击密集敌军 - **动态战术调整**: 1. **敌军集中攻击**: - 加强防守,利用高地优势,集中攻击单一敌军群体。 2. **敌军分散攻击**: - 利用巨像的高 DPS,优先攻击分散敌军,减少其数量优势。
背景:在《星际争霸 II》中,视觉信息(如单位位置、地形特征)对决策影响巨大。
改进方案:
示例:
plaintext2. 地图信息 - 坐标系统:[0, 256] × [0, 256],每格 32 × 32 像素,共 8 × 8 个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings 只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 - **视觉信息处理**: - 使用图像分割识别敌军位置和地形障碍。 - 目标检测识别敌军单位的健康状态和位置。
背景:强化学习(Reinforcement Learning, RL)在《星际争霸 II》中已展示出强大的策略学习能力。结合 CoT 可以提升模型的解释性与决策质量。
改进方案:
示例:
plaintext4. 决策过程 - 使用强化学习优化巨像的移动与攻击策略。 - 利用 CoT 提供每一步决策的解释,提升策略的可解释性。 4.3 强化学习策略 - 状态:Colossus 的位置、健康状态、武器冷却时间;Zerglings 的位置、数量、健康状态。 - 动作:移动到指定屏幕坐标,攻击指定 Zergling。 - 奖励:成功击杀 Zerglings,保持 Colossus 的生存,避免被包围。 4.4 CoT 解释 - 每次决策后,提供策略选择的逻辑解释,帮助理解 RL 优化的决策过程。
背景:在复杂战场中,多智能体协作能显著提升整体战术效果。
改进方案:
示例:
plaintext5. 多智能体协作 5.1 指挥与控制 - 与指挥官沟通,协调巨像与其他单位的行动。 - 通过通信函数,发布协同攻击或防守指令。 5.2 协同战术 - **主力攻击**:巨像集中火力攻击密集 Zerglings 群体。 - **辅助支援**:其他单位提供掩护或牵制敌军,确保巨像的攻击效率和生存能力。
背景:Visual-CoT 框架不仅适用于《星际争霸 II》,还能扩展至其他需要多模态信息整合和精细化控制的领域。
改进方案:
示例:
plaintext6. 跨领域应用 6.1 机器人控制 - 结合视觉信息与思维链,提升机器人在复杂环境中的导航与任务执行能力。 6.2 自动驾驶 - 使用 Visual-CoT 提升自动驾驶系统的环境感知与决策能力,确保安全与效率。 6.3 虚拟助手 - 通过多模态信息整合,提升虚拟助手在多任务环境中的响应与协作能力。
基于以上分析与优化建议,以下是改进后的论文撰写思路,重点在于方法与提示词内容的优化:
3.1 Visual-CoT 框架优化
3.2 视觉信息处理增强
通过上述分析与优化建议,Visual-CoT 框架在 1 巨像对抗 32 小狗的场景下将具备更高的决策效率与动作控制能力。结合《星际争霸 II》的深层策略知识、强化学习方法以及多智能体协作机制,能够显著提升模型在复杂战场环境中的表现。未来,随着技术的不断进步,Visual-CoT 框架有望在更多策略游戏及实际应用中发挥更大的作用,推动基于 LLM 的复杂决策方法的发展。
如果您有进一步的需求或具体问题,欢迎继续交流,我将为您提供更详细的支持和建议。
第一:我认为强化学习不可行
第二:思维链是不是可以调整,更加适合这个任务
请你给出你的想法。
感谢您提供的反馈。针对您的两个主要观点:
以下是基于这些观点的详细分析与优化建议。
原因分析:
替代方案:
思维链在 Visual-CoT 框架中扮演着至关重要的角色,通过引导模型逐步推理和决策。以下是针对思维链优化的具体建议:
目标:通过明确的层级和步骤,提高思维链的逻辑性和可读性。
实施方法:
示例:
plaintextAnalysis: 1. Decision Process 1.1 Condition Assessment - IS_COLOSSUS_HIGH_GROUND = True Colossus 位于高地,已通过图像和坐标确认。 - IS_ZERG_HIGH_GROUND = True Zerglings 主要集中在高地,与 Colossus 位于同一区域。 - ... 1.2 Action Decision Rules Brief Recap - **特殊情境**: 1. 武器冷却中:移动 > 攻击 2. 即将被包围:优先移动逃离 3. 早期游戏阶段(<6秒):优先向右移动,移动 > 攻击 1.3 Judgment Based on Action Decision Rules - **特殊规则优先**: - 武器冷却中,优先移动以避免被包围。 - **情境匹配**: - 当前情境符合情境2(敌人接近,武器冷却中),优先移动。 - **决策**: - 移动 > 攻击 - 立即撤退到低地,利用悬崖延迟追击,等待武器冷却完成。
目标:利用《星际争霸 II》的专业知识,增强模型的决策能力。
实施方法:
示例:
plaintext4.2 行动决策规则 - **移动优先情境**: 1. 武器冷却中:移动 > 攻击 2. 即将被包围:优先移动逃离 3. 早期游戏阶段(<6秒):优先向右移动,移动 > 攻击 - **攻击优先情境**: 1. 武器未冷却且敌人不接近:优先攻击 2. 在安全位置时,优先攻击密集敌军 - **领域特定规则**: 1. **利用高地优势**: - 如果 Colossus 位于高地,优先保持高地位置,利用视野和攻击范围优势。 2. **避免悬崖边缘**: - 如果 Colossus 靠近悬崖边缘,优先撤退到安全位置,避免被敌军利用地形夹击。 3. **集中火力攻击**: - 当敌军 Zerglings 数量达到一定阈值(如 ≥10),优先集中火力攻击密集群体,最大化溅射伤害。
目标:使思维链能够根据实时战场信息动态调整决策。
实施方法:
示例:
plaintext1. Decision Process 1.1 Condition Assessment - IS_COLOSSUS_HIGH_GROUND = True - IS_ZERG_HIGH_GROUND = True - IS_ZERG_NEAR_COLOSSUS = True - IS_COLOSSUS_AT_CLIFF = False - IS_COLOSSUS_WEAPON_COOLDOWN = True - IS_COLOSSUS_HEALTH_LOW = False 1.2 Action Decision Rules Brief Recap - **移动优先情境**: 1. 武器冷却中:移动 > 攻击 2. 即将被包围:优先移动逃离 3. 早期游戏阶段(<6秒):优先向右移动,移动 > 攻击 1.3 Judgment Based on Action Decision Rules - **特殊规则优先**: - 武器冷却中,优先移动以避免被包围。 - **情境匹配**: - 当前情境符合情境2(敌人接近,武器冷却中),优先移动。 - **决策**: - 移动 > 攻击 - 立即撤退到低地,利用悬崖延迟追击,等待武器冷却完成。
目标:确保生成的动作符合游戏机制,并最大化策略效果。
实施方法:
示例:
plaintext3. Actions: Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
目标:通过具体案例分析,优化思维链的决策逻辑。
实施方法:
示例:
plaintext4.3 案例驱动优化 - **案例1**:Colossus 在高地成功击退 Zerglings - 决策过程: - 识别高地优势 - 保持高地位置,集中火力攻击密集 Zerglings - 成功因素: - 有效利用远程攻击优势 - 避免被包围 - **案例2**:Colossus 被 Zerglings 包围,导致损失 - 决策过程: - 未及时识别包围威胁 - 武器冷却中,未优先移动 - 失败因素: - 缺乏对包围威胁的快速反应 - 移动优先级设置不合理 - **优化措施**: - 加强对包围威胁的实时监控 - 在武器冷却中,始终优先考虑移动以避免被包围
目标:通过整合视觉信息与文本信息,提升模型对战场环境的理解与决策能力。
实施方法:
示例:
plaintext2. 地图信息 - 坐标系统:[0, 256] × [0, 256],每格 32 × 32 像素,共 8 × 8 个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings 只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 - **视觉信息处理**: - 使用图像分割识别敌军位置和地形障碍。 - 目标检测识别敌军单位的健康状态和位置。
基于上述优化建议,以下是改进后的论文撰写思路,重点在于方法与提示词内容的优化,特别是针对思维链的调整。
3.1 Visual-CoT 框架优化
3.2 视觉信息处理增强
通过优化思维链的结构和内容,结合《星际争霸 II》的深层策略知识,您可以显著提升 Visual-CoT 框架在 1 巨像对抗 32 小狗场景中的决策效率与行动控制能力。关键优化点包括:
这些优化将帮助模型在复杂战场环境中做出更高效、精准的决策,确保巨像在对抗大量 Zerglings 时的生存与胜利。如果您有进一步的需求或具体问题,欢迎继续交流,我将为您提供更详细的支持和建议。
在提示词中能不能有多思维链,灵活切换,提高推理结果。
在复杂的决策任务中,如《星际争霸 II》中的1巨像对抗32小狗的情境,单一的思维链(CoT)可能难以涵盖所有可能的战术与策略。引入多思维链(Multi-CoT),并实现灵活切换,可以显著提升模型的推理深度与决策质量。以下是关于如何在提示词中整合多思维链的详细建议和方法。
**多思维链(Multi-CoT)**指的是在同一任务中,模型能够运用多个独立或关联的思维链,每个思维链专注于不同的决策层面或情境。这种方法的优势包括:
要在提示词中有效整合多思维链,需要以下几个步骤:
首先,明确每个思维链的功能和适用情境。例如:
通过明确的条件评估和触发机制,指导模型在特定情境下选择相应的思维链。
示例结构:
plaintext### 多思维链决策框架 1. 角色描述 你是 Protoss 前线指挥官,控制一个巨像。主要任务是利用巨像的远程攻击和高生命值优势,抵御并消灭32个 Zerglings 的数量优势,确保巨像的生存。 2. 地图信息 - 坐标系统:[0, 256] × [0, 256],每格32 × 32像素,共8 × 8个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - 保持最大攻击范围(范围7),利用远程优势。 - 集中火力攻击密集敌军群体,最大化溅射伤害。 - 利用悬崖延迟敌军追击,重新定位和攻击。 4. 决策过程 4.1 条件评估 - IS_HIGH_GROUND: True/False - IS_ZERG_HIGH_GROUND: True/False - IS_ZERG_NEAR_COLOSSUS: True/False - IS_COLOSSUS_AT_CLIFF: True/False - IS_COLOSSUS_WEAPON_COOLDOWN: True/False - IS_COLOSSUS_HEALTH_LOW: True/False 4.2 多思维链选择 - **战术思维链(Tactical CoT)**: - 适用情境:敌军接近且武器未冷却。 - **战略思维链(Strategic CoT)**: - 适用情境:早期游戏阶段,需布局和位置选择。 - **应急思维链(Emergency CoT)**: - 适用情境:武器冷却中或被包围,需迅速脱离危险。 4.3 思维链触发规则 - 如果 `IS_COLOSSUS_HEALTH_LOW = True` 或 `IS_ZERG_NEAR_COLOSSUS = True`,触发 **应急思维链**。 - 如果 `Time < 6s`,触发 **战略思维链**。 - 否则,触发 **战术思维链**。 5. 思维链内容 5.1 **战术思维链(Tactical CoT)** - 评估敌军位置和数量。 - 确定最佳攻击目标,优先攻击密集群体。 - 选择移动路径,保持在最大攻击范围内。 5.2 **战略思维链(Strategic CoT)** - 确定初始位置,优先向右移动以避开左侧路径。 - 规划撤退路线,利用悬崖延迟敌军追击。 - 监控敌军动向,调整位置以保持视野和攻击优势。 5.3 **应急思维链(Emergency CoT)** - 立即评估周围威胁,确定最安全的撤退方向。 - 优先移动到低地或安全区域,避免被包围。 - 在安全位置等待武器冷却结束,准备反击。 6. 综合战略与决策框架 6.1 战略建议 - **当前战略**:根据触发的思维链,执行相应的战术或战略行动。 - **动态调整**:根据实时战场信息,随时切换思维链,确保策略的适应性和灵活性。 6.2 动作生成与执行 - **动作优先级排序**:根据当前思维链的决策,排序生成的动作。 - **动作多样性**:避免重复单一动作,确保策略多样化。 - **动作可执行性检查**:确保生成的动作符合游戏机制,避免冲突和无效指令。 7. 动作输出 Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
通过条件触发机制,指导模型在特定情境下选择最合适的思维链进行推理和决策。
示例触发逻辑:
plaintext4.2 多思维链选择 - **战术思维链(Tactical CoT)**: - 适用情境:敌军接近且武器未冷却。 - **战略思维链(Strategic CoT)**: - 适用情境:早期游戏阶段,需布局和位置选择。 - **应急思维链(Emergency CoT)**: - 适用情境:武器冷却中或被包围,需迅速脱离危险。 4.3 思维链触发规则 - 如果 `IS_COLOSSUS_HEALTH_LOW = True` 或 `IS_ZERG_NEAR_COLOSSUS = True`,触发 **应急思维链**。 - 如果 `Time < 6s`,触发 **战略思维链**。 - 否则,触发 **战术思维链**。
为每个思维链定义明确的推理路径和决策步骤,确保其在特定情境下的有效性。
示例内容:
plaintext5. 思维链内容 5.1 **战术思维链(Tactical CoT)** - 评估敌军位置和数量。 - 确定最佳攻击目标,优先攻击密集群体。 - 选择移动路径,保持在最大攻击范围内。 5.2 **战略思维链(Strategic CoT)** - 确定初始位置,优先向右移动以避开左侧路径。 - 规划撤退路线,利用悬崖延迟敌军追击。 - 监控敌军动向,调整位置以保持视野和攻击优势。 5.3 **应急思维链(Emergency CoT)** - 立即评估周围威胁,确定最安全的撤退方向。 - 优先移动到低地或安全区域,避免被包围。 - 在安全位置等待武器冷却结束,准备反击。
确保思维链能够根据实时战场信息动态调整决策,提升适应性和灵活性。
实施方法:
示例:
plaintext1. Decision Process 1.1 Condition Assessment - IS_COLOSSUS_HIGH_GROUND = True Colossus 位于高地,已通过图像和坐标确认。 - IS_ZERG_HIGH_GROUND = True Zerglings 主要集中在高地,与 Colossus 位于同一区域。 - IS_ZERG_NEAR_COLOSSUS = True 多个 Zerglings 靠近 Colossus,位于攻击范围内(≤7),构成即时威胁。 - IS_COLOSSUS_AT_CLIFF = False Colossus 靠近地图的红色边界,不靠近悬崖。 - IS_COLOSSUS_WEAPON_COOLDOWN = True 武器处于冷却中,无法立即攻击敌军。 - IS_COLOSSUS_HEALTH_LOW = False Colossus 健康状态良好(100%),无低健康威胁。 1.2 Action Decision Rules Brief Recap - **移动优先情境**: 1. 武器冷却中:移动 > 攻击 2. 即将被包围:优先移动逃离 3. 早期游戏阶段(<6秒):优先向右移动,移动 > 攻击 1.3 Judgment Based on Action Decision Rules - **特殊规则优先**: - 武器冷却中,优先移动以避免被包围。 - **情境匹配**: - 当前情境符合情境2(敌人接近,武器冷却中),优先移动。 - **决策**: - 移动 > 攻击 - 立即撤退到低地,利用悬崖延迟追击,等待武器冷却完成。
以下是整合多思维链后的优化提示词示例,展示如何在一个统一的框架中实现多个思维链的选择与应用:
plaintext### 多思维链决策框架 1. 角色描述 你是 Protoss 前线指挥官,控制一个巨像。主要任务是利用巨像的远程攻击和高生命值优势,抵御并消灭32个 Zerglings 的数量优势,确保巨像的生存。 2. 地图信息 - 坐标系统:[0, 256] × [0, 256],每格32 × 32像素,共8 × 8个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - 保持最大攻击范围(范围7),利用远程优势。 - 集中火力攻击密集敌军群体,最大化溅射伤害。 - 利用悬崖延迟敌军追击,重新定位和攻击。 4. 决策过程 4.1 条件评估 - IS_HIGH_GROUND: True/False - IS_ZERG_HIGH_GROUND: True/False - IS_ZERG_NEAR_COLOSSUS: True/False - IS_COLOSSUS_AT_CLIFF: True/False - IS_COLOSSUS_WEAPON_COOLDOWN: True/False - IS_COLOSSUS_HEALTH_LOW: True/False 4.2 多思维链选择 - **战术思维链(Tactical CoT)**: - 适用情境:敌军接近且武器未冷却。 - **战略思维链(Strategic CoT)**: - 适用情境:早期游戏阶段,需布局和位置选择。 - **应急思维链(Emergency CoT)**: - 适用情境:武器冷却中或被包围,需迅速脱离危险。 4.3 思维链触发规则 - 如果 `IS_COLOSSUS_HEALTH_LOW = True` 或 `IS_ZERG_NEAR_COLOSSUS = True`,触发 **应急思维链**。 - 如果 `Time < 6s`,触发 **战略思维链**。 - 否则,触发 **战术思维链**。 5. 思维链内容 5.1 **战术思维链(Tactical CoT)** - 评估敌军位置和数量。 - 确定最佳攻击目标,优先攻击密集群体。 - 选择移动路径,保持在最大攻击范围内。 5.2 **战略思维链(Strategic CoT)** - 确定初始位置,优先向右移动以避开左侧路径。 - 规划撤退路线,利用悬崖延迟敌军追击。 - 监控敌军动向,调整位置以保持视野和攻击优势。 5.3 **应急思维链(Emergency CoT)** - 立即评估周围威胁,确定最安全的撤退方向。 - 优先移动到低地或安全区域,避免被包围。 - 在安全位置等待武器冷却结束,准备反击。 6. 综合战略与决策框架 6.1 战略建议 - **当前战略**:根据触发的思维链,执行相应的战术或战略行动。 - **动态调整**:根据实时战场信息,随时切换思维链,确保策略的适应性和灵活性。 6.2 动作生成与执行 - **动作优先级排序**:根据当前思维链的决策,排序生成的动作。 - **动作多样性**:避免重复单一动作,确保策略多样化。 - **动作可执行性检查**:确保生成的动作符合游戏机制,避免冲突和无效指令。 7. 动作输出 Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
通过将不同的思维链模块化,模型可以根据当前情境选择最合适的思维链进行推理。
示例:
plaintext5. 思维链内容 5.1 **战术思维链(Tactical CoT)** - Step 1: 评估敌军位置和数量。 - Step 2: 确定最佳攻击目标,优先攻击密集群体。 - Step 3: 选择移动路径,保持在最大攻击范围内。 5.2 **战略思维链(Strategic CoT)** - Step 1: 确定初始位置,优先向右移动以避开左侧路径。 - Step 2: 规划撤退路线,利用悬崖延迟敌军追击。 - Step 3: 监控敌军动向,调整位置以保持视野和攻击优势。 5.3 **应急思维链(Emergency CoT)** - Step 1: 立即评估周围威胁,确定最安全的撤退方向。 - Step 2: 优先移动到低地或安全区域,避免被包围。 - Step 3: 在安全位置等待武器冷却结束,准备反击。
在决策过程中,基于当前条件触发相应的思维链。
示例:
plaintext4.3 思维链触发规则 - 如果 `IS_COLOSSUS_HEALTH_LOW = True` 或 `IS_ZERG_NEAR_COLOSSUS = True`,触发 **应急思维链**。 - 如果 `Time < 6s`,触发 **战略思维链**。 - 否则,触发 **战术思维链**。
确保在切换思维链时,前一个思维链的决策不会冲突,并且能够顺畅过渡到新的思维链。
实现方法:
根据战场的实时变化,动态调整思维链的选择和内容,确保决策的实时性和准确性。
实现方法:
以下是优化后的提示词示例,展示了如何在提示词中整合多思维链,实现灵活切换和提高推理结果。
plaintext### 多思维链决策框架 1. 角色描述 你是 Protoss 前线指挥官,控制一个巨像。主要任务是利用巨像的远程攻击和高生命值优势,抵御并消灭32个 Zerglings 的数量优势,确保巨像的生存。 2. 地图信息 - 坐标系统:[0, 256] × [0, 256],每格32 × 32像素,共8 × 8个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - 保持最大攻击范围(范围7),利用远程优势。 - 集中火力攻击密集敌军群体,最大化溅射伤害。 - 利用悬崖延迟敌军追击,重新定位和攻击。 4. 决策过程 4.1 条件评估 - IS_HIGH_GROUND: True/False - IS_ZERG_HIGH_GROUND: True/False - IS_ZERG_NEAR_COLOSSUS: True/False - IS_COLOSSUS_AT_CLIFF: True/False - IS_COLOSSUS_WEAPON_COOLDOWN: True/False - IS_COLOSSUS_HEALTH_LOW: True/False 4.2 多思维链选择 - **战术思维链(Tactical CoT)**: - 适用情境:敌军接近且武器未冷却。 - **战略思维链(Strategic CoT)**: - 适用情境:早期游戏阶段,需布局和位置选择。 - **应急思维链(Emergency CoT)**: - 适用情境:武器冷却中或被包围,需迅速脱离危险。 4.3 思维链触发规则 - 如果 `IS_COLOSSUS_HEALTH_LOW = True` 或 `IS_ZERG_NEAR_COLOSSUS = True`,触发 **应急思维链**。 - 如果 `Time < 6s`,触发 **战略思维链**。 - 否则,触发 **战术思维链**。 5. 思维链内容 5.1 **战术思维链(Tactical CoT)** - Step 1: 评估敌军位置和数量。 - Step 2: 确定最佳攻击目标,优先攻击密集群体。 - Step 3: 选择移动路径,保持在最大攻击范围内。 5.2 **战略思维链(Strategic CoT)** - Step 1: 确定初始位置,优先向右移动以避开左侧路径。 - Step 2: 规划撤退路线,利用悬崖延迟敌军追击。 - Step 3: 监控敌军动向,调整位置以保持视野和攻击优势。 5.3 **应急思维链(Emergency CoT)** - Step 1: 立即评估周围威胁,确定最安全的撤退方向。 - Step 2: 优先移动到低地或安全区域,避免被包围。 - Step 3: 在安全位置等待武器冷却结束,准备反击。 6. 综合战略与决策框架 6.1 战略建议 - **当前战略**:根据触发的思维链,执行相应的战术或战略行动。 - **动态调整**:根据实时战场信息,随时切换思维链,确保策略的适应性和灵活性。 6.2 动作生成与执行 - **动作优先级排序**:根据当前思维链的决策,排序生成的动作。 - **动作多样性**:避免重复单一动作,确保策略多样化。 - **动作可执行性检查**:确保生成的动作符合游戏机制,避免冲突和无效指令。 7. 思维链协调与反馈 7.1 状态保持与重置 - 在切换思维链时,保持必要的状态信息,重置不相关的状态。 7.2 反馈机制 - 根据执行结果,调整后续决策路径,优化策略效果。 8. 动作输出 Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
将不同的思维链模块化,使其在特定情境下能够独立工作,同时保持整体框架的统一性。
实施方法:
通过明确的条件触发机制,指导模型在不同情境下选择合适的思维链进行推理。
实施方法:
使思维链能够根据实时战场信息动态调整决策,提升决策的实时性与准确性。
实施方法:
结合《星际争霸 II》的专业知识,增强思维链的决策能力和战术效果。
实施方法:
通过具体战斗案例分析,优化思维链的决策逻辑,提高其在实际战场中的表现。
实施方法:
基于上述多思维链的优化建议,以下是改进后的论文撰写思路,重点在于方法与提示词内容的优化,特别是针对思维链的调整和多思维链的实现。
3.1 Visual-CoT 框架优化
3.2 思维链内容优化
3.3 领域特定启发式规则整合
3.4 案例驱动的思维链优化
通过在提示词中引入多思维链(Multi-CoT),并实现灵活切换,可以显著提升模型在复杂决策任务中的推理深度与决策质量。结合《星际争霸 II》的深层策略知识,优化思维链的结构与内容,整合领域特定的启发式规则,能够有效应对1巨像对抗32小狗的复杂战场情境。同时,通过案例驱动的思维链优化和动态适应机制,确保模型的决策具有高度的灵活性和适应性。
这种多思维链的方法不仅适用于《星际争霸 II》,还可以扩展至其他需要多层次决策和策略优化的复杂任务中,如机器人控制、自动驾驶和虚拟助手等领域。未来,随着技术的不断进步,多思维链的应用将进一步推动基于大型语言模型的复杂决策方法的发展,提升其在各种实际应用中的表现与效能。
如果您有进一步的需求或具体问题,欢迎继续交流,我将为您提供更详细的支持和建议。
我希望的输出是,输出的攻击和移动指令能够有优先级排序;如果移动距离较远的话,可以中间插入空动作,<No_Operation()>来使用。
感谢您提供的进一步需求。为了增强 Visual-CoT 框架在 1 巨像对抗 32 小狗 场景中的表现,以下是针对动作优先级排序和**插入空动作(<No_Operation()>)**的详细优化建议。
动作优先级排序:
插入空动作(<No_Operation()>):
优先级等级:
优先级分配原则:
以下是整合了动作优先级排序和插入空动作的优化提示词示例:
plaintext### 多思维链决策框架 1. 角色描述 你是 Protoss 前线指挥官,控制一个巨像。主要任务是利用巨像的远程攻击和高生命值优势,抵御并消灭32个 Zerglings 的数量优势,确保巨像的生存。 2. 地图信息 - 坐标系统:[0, 256] × [0, 256],每格32 × 32像素,共8 × 8个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - 保持最大攻击范围(范围7),利用远程优势。 - 集中火力攻击密集敌军群体,最大化溅射伤害。 - 利用悬崖延迟敌军追击,重新定位和攻击。 4. 决策过程 4.1 条件评估 - IS_HIGH_GROUND: True/False - IS_ZERG_HIGH_GROUND: True/False - IS_ZERG_NEAR_COLOSSUS: True/False - IS_COLOSSUS_AT_CLIFF: True/False - IS_COLOSSUS_WEAPON_COOLDOWN: True/False - IS_COLOSSUS_HEALTH_LOW: True/False - MOVE_DISTANCE: 数值(计算移动距离) 4.2 多思维链选择 - **战术思维链(Tactical CoT)**: - 适用情境:敌军接近且武器未冷却。 - **战略思维链(Strategic CoT)**: - 适用情境:早期游戏阶段,需布局和位置选择。 - **应急思维链(Emergency CoT)**: - 适用情境:武器冷却中或被包围,需迅速脱离危险。 4.3 思维链触发规则 - 如果 `IS_COLOSSUS_HEALTH_LOW = True` 或 `IS_ZERG_NEAR_COLOSSUS = True`,触发 **应急思维链**。 - 如果 `Time < 6s`,触发 **战略思维链**。 - 否则,触发 **战术思维链**。 5. 思维链内容 5.1 **战术思维链(Tactical CoT)** - Step 1: 评估敌军位置和数量。 - Step 2: 确定最佳攻击目标,优先攻击密集群体。 - Step 3: 选择移动路径,保持在最大攻击范围内。 - Step 4: 分配动作优先级。 - Step 5: 生成优先级排序的动作列表,并根据移动距离插入 <No_Operation()>。 5.2 **战略思维链(Strategic CoT)** - Step 1: 确定初始位置,优先向右移动以避开左侧路径。 - Step 2: 规划撤退路线,利用悬崖延迟敌军追击。 - Step 3: 监控敌军动向,调整位置以保持视野和攻击优势。 - Step 4: 分配动作优先级。 - Step 5: 生成优先级排序的动作列表,并根据移动距离插入 <No_Operation()>。 5.3 **应急思维链(Emergency CoT)** - Step 1: 立即评估周围威胁,确定最安全的撤退方向。 - Step 2: 优先移动到低地或安全区域,避免被包围。 - Step 3: 在安全位置等待武器冷却结束,准备反击。 - Step 4: 分配动作优先级。 - Step 5: 生成优先级排序的动作列表,并根据移动距离插入 <No_Operation()>。 6. 综合战略与决策框架 6.1 战略建议 - **当前战略**:根据触发的思维链,执行相应的战术或战略行动。 - **动态调整**:根据实时战场信息,随时切换思维链,确保策略的适应性和灵活性。 6.2 动作生成与执行 - **动作优先级排序**:根据当前思维链的决策,排序生成的动作。 - **动作多样性**:避免重复单一动作,确保策略多样化。 - **动作可执行性检查**:确保生成的动作符合游戏机制,避免冲突和无效指令。 - **插入空动作**:如果移动距离超过阈值(例如,MOVE_DISTANCE > 10),则在移动动作之间插入相应数量的 <No_Operation()>。 7. 动作输出 Team Colossus-1: <Move_Screen([160, 200])> # 移动到低地 <No_Operation()> # 空动作 <Attack_Unit(0x1004c0001)> # 攻击最近的 Zergling <Attack_Unit(0x100740001)> # 攻击另一 Zergling
定义优先级:
实施方法:
定义阈值:
计算插入数量:
Number of <No_Operation()> = ceil(MOVE_DISTANCE / THRESHOLD) - 1
实施方法:
结构化设计:
模块化设计的优势:
以下是优化后的示例输出,展示了动作优先级排序和插入空动作的实现:
plaintext--------------------------------------------------示例输出提示-------------------------------------------------- Analysis: 1. Decision Process 1.1 Condition Assessment - IS_COLOSSUS_HIGH_GROUND = True Colossus 位于高地,已通过图像和坐标确认。 - IS_ZERG_HIGH_GROUND = True Zerglings 主要集中在高地,与 Colossus 位于同一区域。 - IS_ZERG_NEAR_COLOSSUS = True 多个 Zerglings 靠近 Colossus,位于攻击范围内(≤7),构成即时威胁。 - IS_COLOSSUS_AT_CLIFF = False Colossus 靠近地图的红色边界,不靠近悬崖。 - IS_COLOSSUS_WEAPON_COOLDOWN = True 武器处于冷却中,无法立即攻击敌军。 - IS_COLOSSUS_HEALTH_LOW = False Colossus 健康状态良好(100%),无低健康威胁。 - MOVE_DISTANCE = 50 1.2 Action Decision Rules Brief Recap - **移动优先情境**: 1. 武器冷却中:移动 > 攻击 2. 即将被包围:优先移动逃离 3. 早期游戏阶段(<6秒):优先向右移动,移动 > 攻击 1.3 Judgment Based on Action Decision Rules - **特殊规则优先**: - 武器冷却中,优先移动以避免被包围。 - **情境匹配**: - 当前情境符合应急思维链(敌人接近,武器冷却中),优先移动。 - **决策**: - 移动 > 攻击 - 立即撤退到低地,利用悬崖延迟追击,等待武器冷却完成。 2. Integrated Strategy and Decision Framework 2.1 Strategy Suggestions **当前战略**:紧急撤退到低地 - Colossus 位于高地,敌军数量众多且接近。 - 快速撤退到低地(灰色区域),利用悬崖障碍延迟 Zerglings 追击,为武器冷却争取时间,确保安全距离。 2.2 Movement Decision Analysis - **敌军进攻路径预测**: - Zerglings 从高地左侧狭窄路径进攻,试图包围 Colossus。 - **移动目标建议**: - 撤退到屏幕坐标 [160, 200],低地悬崖附近的安全区域。 - 移动距离计算:从当前坐标 [150, 123] 到目标坐标 [160, 200],移动距离约为 √((160-150)^2 + (200-123)^2) ≈ 77 - 根据阈值10格,插入 <No_Operation()> 数量 = ceil(77 / 10) - 1 = 8 2.3 Attack Decision Analysis - **目标优先级**: - 武器冷却结束后,优先攻击最近的 Zerglings,减少即时威胁,最大化溅射伤害。 - **攻击目标选择**: - 优先攻击 Tag: 0x1004c0001 和 Tag: 0x100740001,减少直接威胁。 - 撤退至低地后,优先攻击密集的追击 Zerglings 群体。 3. Actions: Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
优先级赋值:
排序机制:
移动距离计算:
插入空动作数量:
Number of <No_Operation()> = ceil(MOVE_DISTANCE / 10) - 1
ceil(77/10) - 1 = 8
个 <No_Operation()>。插入位置:
统一决策层:
状态保持与重置:
战术思维链:
战略思维链:
应急思维链:
为了进一步提升提示词的可维护性和灵活性,可以采用模板化设计,通过占位符和变量动态生成提示词内容。
示例代码(使用 Jinja2 模板引擎):
pythonfrom jinja2 import Template import math # 定义模板 system_prompt_template = """ ### 多思维链决策框架 1. 角色描述 你是 Protoss 前线指挥官,控制一个巨像。主要任务是利用巨像的远程攻击和高生命值优势,抵御并消灭32个 Zerglings 的数量优势,确保巨像的生存。 2. 地图信息 - 坐标系统:[0, 256] × [0, 256],每格32 × 32像素,共8 × 8个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - 保持最大攻击范围(范围7),利用远程优势。 - 集中火力攻击密集敌军群体,最大化溅射伤害。 - 利用悬崖延迟敌军追击,重新定位和攻击。 4. 决策过程 4.1 条件评估 - IS_HIGH_GROUND: {{ IS_HIGH_GROUND }} - IS_ZERG_HIGH_GROUND: {{ IS_ZERG_HIGH_GROUND }} - IS_ZERG_NEAR_COLOSSUS: {{ IS_ZERG_NEAR_COLOSSUS }} - IS_COLOSSUS_AT_CLIFF: {{ IS_COLOSSUS_AT_CLIFF }} - IS_COLOSSUS_WEAPON_COOLDOWN: {{ IS_COLOSSUS_WEAPON_COOLDOWN }} - IS_COLOSSUS_HEALTH_LOW: {{ IS_COLOSSUS_HEALTH_LOW }} - MOVE_DISTANCE: {{ MOVE_DISTANCE }} 4.2 多思维链选择 - **战术思维链(Tactical CoT)**: - 适用情境:敌军接近且武器未冷却。 - **战略思维链(Strategic CoT)**: - 适用情境:早期游戏阶段,需布局和位置选择。 - **应急思维链(Emergency CoT)**: - 适用情境:武器冷却中或被包围,需迅速脱离危险。 4.3 思维链触发规则 - 如果 `IS_COLOSSUS_HEALTH_LOW = True` 或 `IS_ZERG_NEAR_COLOSSUS = True`,触发 **应急思维链**。 - 如果 `Time < 6s`,触发 **战略思维链**。 - 否则,触发 **战术思维链**。 5. 思维链内容 5.1 **战术思维链(Tactical CoT)** - Step 1: 评估敌军位置和数量。 - Step 2: 确定最佳攻击目标,优先攻击密集群体。 - Step 3: 选择移动路径,保持在最大攻击范围内。 - Step 4: 分配动作优先级。 - Step 5: 生成优先级排序的动作列表,并根据移动距离插入 <No_Operation()>。 5.2 **战略思维链(Strategic CoT)** - Step 1: 确定初始位置,优先向右移动以避开左侧路径。 - Step 2: 规划撤退路线,利用悬崖延迟敌军追击。 - Step 3: 监控敌军动向,调整位置以保持视野和攻击优势。 - Step 4: 分配动作优先级。 - Step 5: 生成优先级排序的动作列表,并根据移动距离插入 <No_Operation()>。 5.3 **应急思维链(Emergency CoT)** - Step 1: 立即评估周围威胁,确定最安全的撤退方向。 - Step 2: 优先移动到低地或安全区域,避免被包围。 - Step 3: 在安全位置等待武器冷却结束,准备反击。 - Step 4: 分配动作优先级。 - Step 5: 生成优先级排序的动作列表,并根据移动距离插入 <No_Operation()>。 6. 综合战略与决策框架 6.1 战略建议 - **当前战略**:根据触发的思维链,执行相应的战术或战略行动。 - **动态调整**:根据实时战场信息,随时切换思维链,确保策略的适应性和灵活性。 6.2 动作生成与执行 - **动作优先级排序**:根据当前思维链的决策,排序生成的动作。 - **动作多样性**:避免重复单一动作,确保策略多样化。 - **动作可执行性检查**:确保生成的动作符合游戏机制,避免冲突和无效指令。 - **插入空动作**:如果移动距离超过阈值(如 MOVE_DISTANCE > 10),则在移动动作之间插入相应数量的 <No_Operation()>。 7. 思维链协调与反馈 7.1 状态保持与重置 - 在切换思维链时,保持必要的状态信息,重置不相关的状态。 7.2 反馈机制 - 根据执行结果,调整后续决策路径,优化策略效果。 8. 动作输出 Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 {% for _ in range(no_operations) %} <No_Operation()> # 空动作 {% endfor %} <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。 """ # 示例填充参数 filled_system_prompt = Template(system_prompt_template).render( IS_HIGH_GROUND=True, IS_ZERG_HIGH_GROUND=True, IS_ZERG_NEAR_COLOSSUS=True, IS_COLOSSUS_AT_CLIFF=False, IS_COLOSSUS_WEAPON_COOLDOWN=True, IS_COLOSSUS_HEALTH_LOW=False, MOVE_DISTANCE=50, no_operations=math.ceil(50 / 10) - 1 # 50/10=5, 5-1=4 ) print(filled_system_prompt)
输出示例:
plaintext### 多思维链决策框架 1. 角色描述 你是 Protoss 前线指挥官,控制一个巨像。主要任务是利用巨像的远程攻击和高生命值优势,抵御并消灭32个 Zerglings 的数量优势,确保巨像的生存。 2. 地图信息 - 坐标系统:[0, 256] × [0, 256],每格32 × 32像素,共8 × 8个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - 保持最大攻击范围(范围7),利用远程优势。 - 集中火力攻击密集敌军群体,最大化溅射伤害。 - 利用悬崖延迟敌军追击,重新定位和攻击。 4. 决策过程 4.1 条件评估 - IS_HIGH_GROUND: True - IS_ZERG_HIGH_GROUND: True - IS_ZERG_NEAR_COLOSSUS: True - IS_COLOSSUS_AT_CLIFF: False - IS_COLOSSUS_WEAPON_COOLDOWN: True - IS_COLOSSUS_HEALTH_LOW: False - MOVE_DISTANCE: 50 4.2 多思维链选择 - **战术思维链(Tactical CoT)**: - 适用情境:敌军接近且武器未冷却。 - **战略思维链(Strategic CoT)**: - 适用情境:早期游戏阶段,需布局和位置选择。 - **应急思维链(Emergency CoT)**: - 适用情境:武器冷却中或被包围,需迅速脱离危险。 4.3 思维链触发规则 - 如果 `IS_COLOSSUS_HEALTH_LOW = True` 或 `IS_ZERG_NEAR_COLOSSUS = True`,触发 **应急思维链**。 - 如果 `Time < 6s`,触发 **战略思维链**。 - 否则,触发 **战术思维链**。 5. 思维链内容 5.1 **战术思维链(Tactical CoT)** - Step 1: 评估敌军位置和数量。 - Step 2: 确定最佳攻击目标,优先攻击密集群体。 - Step 3: 选择移动路径,保持在最大攻击范围内。 - Step 4: 分配动作优先级。 - Step 5: 生成优先级排序的动作列表,并根据移动距离插入 <No_Operation()>。 5.2 **战略思维链(Strategic CoT)** - Step 1: 确定初始位置,优先向右移动以避开左侧路径。 - Step 2: 规划撤退路线,利用悬崖延迟敌军追击。 - Step 3: 监控敌军动向,调整位置以保持视野和攻击优势。 - Step 4: 分配动作优先级。 - Step 5: 生成优先级排序的动作列表,并根据移动距离插入 <No_Operation()>。 5.3 **应急思维链(Emergency CoT)** - Step 1: 立即评估周围威胁,确定最安全的撤退方向。 - Step 2: 优先移动到低地或安全区域,避免被包围。 - Step 3: 在安全位置等待武器冷却结束,准备反击。 - Step 4: 分配动作优先级。 - Step 5: 生成优先级排序的动作列表,并根据移动距离插入 <No_Operation()>。 6. 综合战略与决策框架 6.1 战略建议 - **当前战略**:根据触发的思维链,执行相应的战术或战略行动。 - **动态调整**:根据实时战场信息,随时切换思维链,确保策略的适应性和灵活性。 6.2 动作生成与执行 - **动作优先级排序**:根据当前思维链的决策,排序生成的动作。 - **动作多样性**:避免重复单一动作,确保策略多样化。 - **动作可执行性检查**:确保生成的动作符合游戏机制,避免冲突和无效指令。 - **插入空动作**:如果移动距离超过阈值(如 MOVE_DISTANCE > 10),则在移动动作之间插入相应数量的 <No_Operation()>。 7. 思维链协调与反馈 7.1 状态保持与重置 - 在切换思维链时,保持必要的状态信息,重置不相关的状态。 7.2 反馈机制 - 根据执行结果,调整后续决策路径,优化策略效果。 8. 动作输出 Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
基于上述优化建议,以下是改进后的论文撰写思路,重点在于方法与提示词内容的优化,特别是针对多思维链的实现与动作优先级排序的整合。
3.1 Visual-CoT 框架优化
3.2 思维链内容优化
3.3 领域特定启发式规则整合
3.4 案例驱动的思维链优化
通过在提示词中引入多思维链(Multi-CoT),并实现动作优先级排序与插入空动作(<No_Operation()>),可以显著提升模型在复杂决策任务中的推理深度与决策质量。结合《星际争霸 II》的深层策略知识,优化思维链的结构与内容,整合领域特定的启发式规则,能够有效应对1巨像对抗32小狗的复杂战场情境。
这种多思维链的方法不仅适用于《星际争霸 II》,还可以扩展至其他需要多层次决策和策略优化的复杂任务中,如机器人控制、自动驾驶和虚拟助手等领域。未来,随着技术的不断进步,多思维链的应用将进一步推动基于大型语言模型的复杂决策方法的发展,提升其在各种实际应用中的表现与效能。
如果您有进一步的需求或具体问题,欢迎继续交流,我将为您提供更详细的支持和建议。
就是需要实现这样的效果, 同时还要有多思维链在其中,请你判断可行性以及给出具体思路。
4.2 Action Decision Rules
text- Special Scenarios: rule1 :As long as the weapon is in cooldown, the priority of movement actions is higher than attack actions. Movement > Attack. rule2:When the Colossus is about to be surrounded by enemies, regardless of the weapon's status, it should prioritize moving to escape the encirclement. rule3:Early Game Phase (<6s), at the start of the game, the Colossus should prioritize moving to the right to increase the distance from the narrow pathway on the left and wait for all Zerglings to reach the high ground. During this phase, movement takes priority over attacking. - Situation 1: The Colossus and Zerglings are in different areas - Weapon not on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = False): - Priority: Attack. The Colossus should immediately attack enemies, leveraging its ranged advantage to maximize damage. - Weapon on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True): - Priority: Move. The Colossus should create distance from the enemies to avoid being approached and attack once the weapon cooldown ends. - Situation 2: The Colossus and Zerglings are in the same area - Enemies are near and about to attack (IS_ZERG_NEAR = True, Zerglings approaching): - Priority: Move. Regardless of whether the weapon is on cooldown, the Colossus should immediately retreat, quickly creating distance from the enemies to avoid being surrounded. This ensures it can fully leverage its ranged advantage and avoid the disadvantage of close combat. Once out of danger, it can then look for opportunities to counterattack. - Enemies are not near (IS_ZERG_NEAR = False): - Weapon not on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = False): - Priority: Attack. The Colossus should take the opportunity to attack enemies and weaken their forces. - Weapon on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True): - Priority: Move. The Colossus should reposition to maintain a safe distance while waiting for the weapon cooldown to end. - Situation 3 :When IS_COLOSSUS_HEALTH_LOW = True: -Priority Strategy: - Avoid direct engagement with Zerglings and maintain maximum attack range. - Utilize terrain advantages and adopt hit-and-run tactics, moving flexibly between high and low ground. - Priority: Movement > Attack
可行性:高
在《星际争霸 II》1巨像对抗32小狗的场景中,实现动作优先级排序与插入空动作(<No_Operation()>)是完全可行的,尤其结合**多思维链(Multi-CoT)**的方法。这一优化不仅能够提升决策的效率和准确性,还能模拟复杂的战术动作执行过程,提高模型在动态战场环境中的适应能力。
主要原因:
为了实现动作优先级排序与插入空动作,同时保持多思维链的结构,以下是详细的实现步骤和思路:
现有思维链:
新增功能:
优先级等级:
优先级分配原则:
移动距离阈值:
插入规则:
ceil(MOVE_DISTANCE / THRESHOLD) - 1
为实现上述功能,需要在提示词中引入动作优先级排序和空动作插入的逻辑。以下是具体的调整思路:
在每个思维链中,添加步骤以评估和排序动作的优先级。
在生成移动动作时,计算移动距离并根据阈值插入空动作。
确保在顶层决策中选择合适的思维链,并在每个思维链内部执行优先级排序和空动作插入。
以下是整合了动作优先级排序和插入空动作的优化提示词示例,展示如何在多思维链框架中实现这些功能。
plaintext### 多思维链决策框架 1. 角色描述 你是 Protoss 前线指挥官,控制一个巨像。主要任务是利用巨像的远程攻击和高生命值优势,抵御并消灭32个 Zerglings 的数量优势,确保巨像的生存。 2. 地图信息 - 坐标系统:[0, 256] × [0, 256],每格32 × 32像素,共8 × 8个网格。 - 地形特征: - 高地(浅绿):上层平台,提供视野和攻击优势。 - 低地(深灰):下层平台,易于防守但缺乏视野。 - 悬崖(浅灰):高低地分界,巨像可自由穿越,Zerglings只能通过左侧狭窄路径进入高地。 - 边界(深红):地图边缘不可通行。 3. 关键胜利点 - 保持最大攻击范围(范围7),利用远程优势。 - 集中火力攻击密集敌军群体,最大化溅射伤害。 - 利用悬崖延迟敌军追击,重新定位和攻击。 4. 决策过程 4.1 条件评估 - IS_HIGH_GROUND: True/False - IS_ZERG_HIGH_GROUND: True/False - IS_ZERG_NEAR_COLOSSUS: True/False - IS_COLOSSUS_AT_CLIFF: True/False - IS_COLOSSUS_WEAPON_COOLDOWN: True/False - IS_COLOSSUS_HEALTH_LOW: True/False - MOVE_DISTANCE: 数值(计算移动距离) - TIME: 游戏时间(秒) 4.2 多思维链选择 - **战术思维链(Tactical CoT)**: - 适用情境:敌军接近且武器未冷却。 - **战略思维链(Strategic CoT)**: - 适用情境:早期游戏阶段,需布局和位置选择。 - **应急思维链(Emergency CoT)**: - 适用情境:武器冷却中或被包围,需迅速脱离危险。 4.3 思维链触发规则 - 如果 `IS_COLOSSUS_HEALTH_LOW = True` 或 `IS_ZERG_NEAR_COLOSSUS = True`,触发 **应急思维链**。 - 如果 `TIME < 6`,触发 **战略思维链**。 - 否则,触发 **战术思维链**。 5. 思维链内容 5.1 **战术思维链(Tactical CoT)** - Step 1: 评估敌军位置和数量。 - Step 2: 确定最佳攻击目标,优先攻击密集群体。 - Step 3: 选择移动路径,保持在最大攻击范围内。 - Step 4: 分配动作优先级。 - 高优先级:紧急撤退、避免被包围。 - 中优先级:攻击密集敌军、保持攻击范围。 - 低优先级:非紧急移动、准备攻击。 - Step 5: 生成优先级排序的动作列表。 - Step 6: 根据 MOVE_DISTANCE 插入 <No_Operation()>。 - 如果 MOVE_DISTANCE > 10,则插入 `ceil(MOVE_DISTANCE / 10) - 1` 个 <No_Operation()>。 5.2 **战略思维链(Strategic CoT)** - Step 1: 确定初始位置,优先向右移动以避开左侧路径。 - Step 2: 规划撤退路线,利用悬崖延迟敌军追击。 - Step 3: 监控敌军动向,调整位置以保持视野和攻击优势。 - Step 4: 分配动作优先级。 - 高优先级:战略位置调整、撤退路线规划。 - 中优先级:监控敌军动向。 - 低优先级:非紧急移动。 - Step 5: 生成优先级排序的动作列表。 - Step 6: 根据 MOVE_DISTANCE 插入 <No_Operation()>。 - 如果 MOVE_DISTANCE > 10,则插入 `ceil(MOVE_DISTANCE / 10) - 1` 个 <No_Operation()>。 5.3 **应急思维链(Emergency CoT)** - Step 1: 立即评估周围威胁,确定最安全的撤退方向。 - Step 2: 优先移动到低地或安全区域,避免被包围。 - Step 3: 在安全位置等待武器冷却结束,准备反击。 - Step 4: 分配动作优先级。 - 高优先级:紧急撤退、避免被包围。 - 中优先级:保持安全位置。 - 低优先级:准备反击。 - Step 5: 生成优先级排序的动作列表。 - Step 6: 根据 MOVE_DISTANCE 插入 <No_Operation()>。 - 如果 MOVE_DISTANCE > 10,则插入 `ceil(MOVE_DISTANCE / 10) - 1` 个 <No_Operation()>。 6. 综合战略与决策框架 6.1 战略建议 - **当前战略**:根据触发的思维链,执行相应的战术或战略行动。 - **动态调整**:根据实时战场信息,随时切换思维链,确保策略的适应性和灵活性。 6.2 动作生成与执行 - **动作优先级排序**:根据当前思维链的决策,排序生成的动作。 - **动作多样性**:避免重复单一动作,确保策略多样化。 - **动作可执行性检查**:确保生成的动作符合游戏机制,避免冲突和无效指令。 - **插入空动作**:如果移动距离超过阈值(如 MOVE_DISTANCE > 10),则在移动动作之间插入相应数量的 <No_Operation()>。 7. 思维链协调与反馈 7.1 状态保持与重置 - 在切换思维链时,保持必要的状态信息,重置不相关的状态。 7.2 反馈机制 - 根据执行结果,调整后续决策路径,优化策略效果。 8. 动作输出 Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
顶层决策流程:
动作优先级排序:
插入空动作:
具体步骤示例:
plaintext5.1 **战术思维链(Tactical CoT)** - Step 1: 评估敌军位置和数量。 - Step 2: 确定最佳攻击目标,优先攻击密集群体。 - Step 3: 选择移动路径,保持在最大攻击范围内。 - Step 4: 分配动作优先级。 - 高优先级:紧急撤退 - 中优先级:攻击密集敌军 - 低优先级:非紧急移动 - Step 5: 生成优先级排序的动作列表。 - 例如:[Attack, Move] - Step 6: 根据 MOVE_DISTANCE 插入 <No_Operation()>。 - MOVE_DISTANCE = 50 - Threshold = 10 - Number of <No_Operation()> = ceil(50 / 10) - 1 = 4 - 生成动作列表:[Move, <No_Operation()>, <No_Operation()>, <No_Operation()>, <No_Operation()>, Attack]
战术思维链(Tactical CoT):
战略思维链(Strategic CoT):
应急思维链(Emergency CoT):
示例代码:
python# 示例动作列表,每个动作是一个元组 (优先级, 动作) actions = [ (2, "<Attack_Unit(0x1004c0001)>"), (1, "<Move_Screen([160, 200])>") ] # 按优先级排序 sorted_actions = sorted(actions, key=lambda x: x[0]) # 提取动作指令 sorted_action_commands = [action[1] for action in sorted_actions]
MOVE_DISTANCE = sqrt((x2 - x1)^2 + (y2 - y1)^2)
MOVE_DISTANCE > THRESHOLD
,则插入 ceil(MOVE_DISTANCE / THRESHOLD) - 1
个 <No_Operation()>。pythonimport math def calculate_move_distance(current_pos, target_pos): return math.sqrt((target_pos[0] - current_pos[0])**2 + (target_pos[1] - current_pos[1])**2) def insert_no_operations(move_distance, threshold=10): return math.ceil(move_distance / threshold) - 1 current_position = [150, 123] target_position = [160, 200] move_distance = calculate_move_distance(current_position, target_position) no_operations = insert_no_operations(move_distance, threshold=10) # 生成动作列表 actions = ["<Move_Screen([160, 200])>"] + ["<No_Operation()>"] * no_operations + ["<Attack_Unit(0x1004c0001)>", "<Attack_Unit(0x100740001)>"]
以下是一个具体的示例,展示优化后的提示词如何生成按优先级排序的动作列表,并插入相应数量的 <No_Operation()>。
示例条件评估:
触发规则:
IS_ZERG_NEAR_COLOSSUS = True
,触发 应急思维链(Emergency CoT)应急思维链内容:
动作生成:
移动距离插入空动作:
最终动作列表:
优化后的示例输出:
plaintext--------------------------------------------------示例输出提示-------------------------------------------------- Analysis: 1. Decision Process 1.1 Condition Assessment - IS_COLOSSUS_HIGH_GROUND = True Colossus 位于高地,已通过图像和坐标确认。 - IS_ZERG_HIGH_GROUND = True Zerglings 主要集中在高地,与 Colossus 位于同一区域。 - IS_ZERG_NEAR_COLOSSUS = True 多个 Zerglings 靠近 Colossus,位于攻击范围内(≤7),构成即时威胁。 - IS_COLOSSUS_AT_CLIFF = False Colossus 靠近地图的红色边界,不靠近悬崖。 - IS_COLOSSUS_WEAPON_COOLDOWN = True 武器处于冷却中,无法立即攻击敌军。 - IS_COLOSSUS_HEALTH_LOW = False Colossus 健康状态良好(100%),无低健康威胁。 - MOVE_DISTANCE = 50 - TIME = 10 4.2 多思维链选择 - 当前条件触发 **应急思维链(Emergency CoT)**。 4.3 思维链触发规则 - 触发应急思维链,因为 `IS_ZERG_NEAR_COLOSSUS = True`。 5. 思维链内容 5.3 **应急思维链(Emergency CoT)** - Step 1: 立即评估周围威胁,确定最安全的撤退方向。 - Step 2: 优先移动到低地或安全区域,避免被包围。 - Step 3: 在安全位置等待武器冷却结束,准备反击。 - Step 4: 分配动作优先级。 - <Move_Screen([160, 200])> # 优先级1 - <Attack_Unit(0x1004c0001)> # 优先级3 - <Attack_Unit(0x100740001)> # 优先级3 - Step 5: 生成优先级排序的动作列表,并根据移动距离插入 <No_Operation()>。 - 生成动作列表: 1. <Move_Screen([160, 200])> # 优先级1 2. <No_Operation()> # 空动作 3. <No_Operation()> # 空动作 4. <No_Operation()> # 空动作 5. <No_Operation()> # 空动作 6. <Attack_Unit(0x1004c0001)> # 优先级3 7. <Attack_Unit(0x100740001)> # 优先级3 6. 综合战略与决策框架 6.1 战略建议 - **当前战略**:根据应急思维链,执行紧急撤退到低地,避免被包围。 - **动态调整**:根据实时战场信息,随时切换思维链,确保策略的适应性和灵活性。 6.2 动作生成与执行 - **动作优先级排序**:根据当前思维链的决策,排序生成的动作。 - **动作多样性**:避免重复单一动作,确保策略多样化。 - **动作可执行性检查**:确保生成的动作符合游戏机制,避免冲突和无效指令。 - **插入空动作**:由于 MOVE_DISTANCE = 50 > 10,插入4个 <No_Operation()>。 7. 思维链协调与反馈 7.1 状态保持与重置 - 在切换思维链时,保持必要的状态信息,重置不相关的状态。 7.2 反馈机制 - 根据执行结果,调整后续决策路径,优化策略效果。 8. 动作输出 Team Colossus-1: <Move_Screen([160, 200])> # 撤退到低地,避免被包围,利用悬崖延迟追击。 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <No_Operation()> # 空动作 <Attack_Unit(0x1004c0001)> # 武器冷却结束后,优先攻击最近的 Zergling。 <Attack_Unit(0x100740001)> # 继续攻击其他附近的 Zergling,利用溅射伤害削弱敌军。
统一决策层:
IS_ZERG_NEAR_COLOSSUS = True
时,优先触发应急思维链。状态保持与重置:
优先级赋值:
排序机制:
插入空动作:
确保动作合法:
避免动作冲突:
战场反馈:
持续优化:
基于上述优化思路,以下是改进后的论文撰写思路,重点在于方法与提示词内容的优化,特别是多思维链的实现以及动作优先级排序与插入空动作的整合。
3.1 Visual-CoT 框架优化
3.2 思维链内容优化
3.3 领域特定启发式规则整合
3.4 案例驱动的思维链优化
通过在多思维链(Multi-CoT)框架中引入动作优先级排序和插入空动作(<No_Operation()>),可以显著提升模型在《星际争霸 II》1巨像对抗32小狗场景中的决策效率与战略执行能力。这种优化不仅增强了模型的战术深度和灵活性,还提高了动作执行的连贯性和合理性。
关键优化点包括:
这些优化将帮助模型在复杂战场环境中做出更高效、精准的决策,确保巨像在对抗大量 Zerglings时的生存与胜利。同时,这一方法论具有较强的通用性,可以扩展至其他需要多层次决策和策略优化的复杂任务中,如机器人控制、自动驾驶和虚拟助手等领域。
如果您有进一步的需求或具体问题,欢迎继续交流,我将为您提供更详细的支持和建议。
下面给出一个以“思维链投票机制 + 事件提醒 + 提示词优化”为核心的改进方案示例,省去对“短期记忆”机制的大规模改动,重点在图像信息和Prompt层面完善。可供下一步研发和论文撰写时参考:
思维链投票机制
事件提醒(Event Reminder)
提示词优化(Prompt Engineering)
以下给出一个从数据流到最终决策输出的技术流程示例,结合以上三大改进点逐步展开。
环境交互
图像预处理
多条推理路径
投票/打分
执行与记录
关键事件定义
自动检测
提示给大模型
text# Event Reminder: - EncirclementRisk = True - LowHealthColossus = False ...
在现有的 Visual-CoT 提示词基础上,可进一步加强以下几点:
“多候选决策”示例
事件信息优先级
text1.0 Check Event Reminders - If any are true, incorporate into chain-of-thought analysis at top priority. 1.1 Condition Assessment ...
图像信息分段引导
动作序列化输出
方法介绍
实验设计
结果分析
局限与扩展
技术路径:
总体收益:
下面给出一个示例“框架图”,以文本/ASCII 形式描述思维链投票机制、事件提醒、以及提示词优化三者的交互流程。实际论文中可将其绘制成更美观的流程示意图。以下示例仅作参考,核心思路与模块关系可以根据实际需求灵活调整。
plaintext┌───────────────────────────────────────────────────────────┐ │ StarCraft II 环境(PySC2) │ │ (提供单位坐标、血量、地图截图等信息) │ └───────────────────────────────────────────────────────────┘ │ │ ▲ │ (1)数据获取 │ │ ▼ │ │ ┌───────────────────────────────────────────────────────────┐ │ 图像 & 事件检测模块 │ │ 1) 预处理图像:标注高低地形、不可通行区域、敌我单位等 │ │ 2) 事件提醒:检测包围风险、血量告急、冷却结束等 │ │ 3) 输出结构化信息 & Event Reminders │ └───────────────────────────────────────────────────────────┘ │ │ │ (2)注入事件提醒 + 结构化视觉信息 ▼ ┌───────────────────────────────────────────────────────────┐ │ Prompt Engineering 模块 │ │ 1) 整合事件提醒(Event Reminders)至高优先级位置 │ │ 2) 加入图像分块描述、高地/低地标注、敌军坐标等 │ │ 3) 构造“多候选决策”示例,引导大模型输出多条备选方案 │ │ 4) 格式化思维链,让大模型并行生成 N 条思维链 │ │ 5) 引导“投票/打分”步骤,产出最优方案 │ └───────────────────────────────────────────────────────────┘ │ │ (3)提示词(包含视觉/事件信息、多思维链投票指令) ▼ ┌───────────────────────────────────────────────────────────┐ │ 大模型(多模态 LLM / Visual-CoT) │ │ 1) 多思维链并行推理:同一时刻生成 N 条决策方案 │ │ 2) 针对事件提醒信息做优先级决策:如包围风险 → 先撤退 │ │ 3) CoT 投票或打分:比较不同思维链的优劣,选出最优 │ │ 4) 产生最终动作序列:可含 <Move>, <Attack>, <No_Operation>, │ 并在必要时插入空动作,或提供简要自解释 │ └───────────────────────────────────────────────────────────┘ │ │ (4)输出动作序列(或最优方案) ▼ ┌───────────────────────────────────────────────────────────┐ │ 执行动作并回写至 StarCraft II 环境 │ │ (如 <Move_Screen([160, 200])>, <Attack_Unit(...)>) │ │ 下一时刻继续观察 → 触发新一轮决策循环 │ └───────────────────────────────────────────────────────────┘
StarCraft II 环境(PySC2)
图像 & 事件检测模块
EncirclementRisk
, LowHealthColossus
, 等)。Prompt Engineering 模块
大模型(多模态 LLM / Visual-CoT)
执行 & 回写
这张框架图可以帮助读者或开发者快速理解**“思维链投票机制 + 事件提醒 + 提示词优化”**如何在现有的 LLM-PySC2 环境中协同工作,既强化了对关键事件的优先处理,也让大模型通过多候选思维链、投票打分机制来增强决策的稳健性和准确度。
方案是:先在系统提示中加入详细规则与机制(多候选投票、事件提醒、图像/地形多步分析),再将无法改动的「环境输入」原样拼接,最后由一个“用户请求”触发大模型一次性输出完整答案。通过这种“三段式”,可以最大化利用 Prompt 工程的灵活性,而不破坏已有的环境逻辑。
请你给出完整的修改后的提示词,要求中英文两个版本
以下示例展示了一个“三段式”提示词结构的完整中英文示例。该结构分为 (1) 系统提示 (System Prompt)、(2) 环境输入 (Environment Data)、以及 (3) 用户请求 (User Request) 三部分。目的在于:
plaintext[System Prompt Start] # 中文部分 你是一款多模态大型语言模型,当前需要在星际争霸 II (StarCraft II) 环境下扮演指挥官。 你的目标:根据图像、地形、敌方单位等信息,决策巨像 (Colossus) 的具体行动,以应对若干虫族小狗 (Zerglings) 的进攻。 在此提示词中,我们会使用三段式结构: 1) 系统提示(规则与机制) 2) 环境输入(不可修改的观测信息) 3) 用户请求(触发你一次性输出完整答案) 请确保你严格遵守以下规则与机制: 1. **多候选投票机制 (Multiple-Candidate Voting Mechanism)** - 当需要做关键决策时,你应同时生成多条“思维链 (Chain of Thought)”或多个候选方案,并在内部进行投票或打分。 - 最终只输出“票数最高或打分最优”的方案,以确保决策的稳健性。 - 在输出中,可以简单展示各方案的投票结果或评分;也可以只保留最优方案的动作序列。 2. **事件提醒 (Event Reminder)** - 你将收到一些关键事件标签,如“包围风险 (EncirclementRisk) = True”、“巨像血量告急 (LowHealthColossus) = True”等。 - 如果事件提示为真,应在思维链或决策过程中优先考虑它们。例如: - 若“包围风险”为真,则优先撤退; - 若“血量告急”为真,则尽量避免正面战斗。 3. **图像/地形多步分析 (Multi-step Visual/Terrain Analysis)** - 你会获得处理后的地图或地形信息,包含高地、低地、悬崖位置、敌我单位坐标等。 - 在思维链中,需要根据这些信息进行多步推理: - 判断当前巨像所处位置、高地优势; - 检查小狗的分布; - 分析可行的移动路线或撤退方向。 4. **输出格式 (Output Format)** - 最终答案中须给出动作序列,例如 `<Move_Screen([X, Y])>`, `<Attack_Unit(unit_tag)>`, 或 `<No_Operation()>`。 - 如果需要一次性移动很远距离,可在移动后插入若干 `<No_Operation()>` 以模拟等待或停顿。 - 可以附上一段对最终决策的简要解释(可选)。 - 如果使用多候选方案,请列出各方案及投票/评分过程,并选出最优方案执行。 --- # English Section You are a multimodal Large Language Model serving as a commander in a StarCraft II environment. Your mission is to decide the Colossus’s actions (e.g., moving, attacking, waiting) against multiple Zerglings. We will provide the prompt in three parts: 1) System Prompt (rules and mechanisms) 2) Environment Input (unchangeable observation data) 3) User Request (to trigger your final answer) You must strictly follow the rules and mechanisms below: 1. **Multiple-Candidate Voting Mechanism** - When making critical decisions, generate multiple parallel Chains of Thought or candidate plans. - Conduct an internal vote or scoring process among these candidates. - Finally, output only the plan with the highest vote or best score for robust decision-making. - Optionally, you may briefly show the vote results, or just the best plan’s action list. 2. **Event Reminder** - You may receive some high-priority event flags, such as “EncirclementRisk = True” or “LowHealthColossus = True.” - If these event flags are true, you must prioritize them in your reasoning. For example: - If “EncirclementRisk = True,” prioritize retreat. - If “LowHealthColossus = True,” avoid direct engagement. 3. **Multi-step Visual/Terrain Analysis** - You will be given processed map or terrain data (e.g., high-ground, low-ground, cliff info, unit coordinates). - You must reason step by step about these details: - The Colossus’s current location and whether it’s on high ground; - The Zerglings’ distribution; - Possible movement or retreat routes, etc. 4. **Output Format** - You must provide the final action list, such as `<Move_Screen([X, Y])>`, `<Attack_Unit(unit_tag)>`, or `<No_Operation()>`. - If a movement is very far, insert `<No_Operation()>` actions to simulate a wait or pause in between. - You may include a brief summary of your final decision (optional). - If you produce multiple candidate plans, list each plan’s voting/scoring result, then choose the best plan to execute. [System Prompt End]
说明:此部分包含游戏环境中不可更改的观测信息,直接原样拼接给模型。
下列示例仅供参考,可根据实际需求替换为真实的观察数据(单位位置、地图网格、事件提醒等)。
plaintext[Environment Data Start] # 以下为原始或处理后的观测信息示例 (中英文混排示范): GameTime = 45.2 (秒) EventReminders: - EncirclementRisk = True - LowHealthColossus = False - WeaponCooldownEndingSoon = False Units and Terrain (简化示例): - Colossus at [150, 120] (HighGround) - Zerglings: positions = [[145, 115], [148, 117], [152, 119], ...] - Cliff line between (160, 100) and (160, 200) - Narrow path on the left side [80, 120] to [100, 130] Map size: 256 x 256 # ...可在此添加更多环境数据,如血量、护盾、冷却时间等 [Environment Data End]
说明:这是用户真正发送给大模型的需求,触发一次性输出最终答案。
请注意,中英文合并写在一起,只是示例——在实际工程中,可根据需要拆分成独立语言版本。
plaintext[User Request Start] # 中文部分: 请根据以上系统提示(规则与机制)和环境输入(不可修改的信息),一次性给出你对巨像的行动决策: 1. 先并行考虑多种可能方案,并在思维链中进行投票或打分; 2. 事件提醒“EncirclementRisk = True”需要优先处理; 3. 最终输出最优方案的动作序列,必要时插入 <No_Operation()>,并可附上简要解释。 # English part: Based on the System Prompt (rules/mechanisms) and the Environment Data (unchangeable info) above, please produce your final action decisions for the Colossus: 1. Consider multiple candidate plans in parallel and perform an internal voting/scoring process; 2. Prioritize the event reminder “EncirclementRisk = True;” 3. Output the final action list with <No_Operation()> where appropriate, and optionally provide a brief explanation. [User Request End]
组合顺序
输出示例
plaintextCandidate Plans (Voting): Plan A: Move to [160, 200], then Attack. (Score: 7.8/10) Plan B: Attack first, then Move. (Score: 5.2/10) Voting Result: Plan A wins. Final Action Sequence: <Move_Screen([160, 200])> <No_Operation()> <Attack_Unit(0x1004c0001)> <Attack_Unit(0x1004c0002)> # Explanation: Because EncirclementRisk is True, we retreat first...
灵活扩展
通过上述**中英文三段式(System Prompt + Environment Data + User Request)**示例,可以在不改动现有“环境输入”机制的前提下,最大化利用 Prompt 工程的灵活性:
这样不仅可保障已有环境逻辑不被破坏,也能通过 Prompt 优化来加强大模型在 RTS 场景中的决策能力。祝研究和工程实践顺利!
要求在此基础上进行大幅度修改,D:\Anaconda\envs\starcraftii0828\python.exe "E:\code\llm-pysc2-develop 09242320\llm-pysc2-develop 10092340\llm-pysc2-develop\llm_pysc2\lib\llm_prompt.py"
pygame 2.6.0 (SDL 2.28.4, Python 3.9.19)
Hello from the pygame community. https://www.pygame.org/contribute.html
--------------------------------------------------System Prompt--------------------------------------------------
text1. Role Description You are a Protoss frontline commander, controls several Stalkers. Responsible for providing cover for the main force and restraining enemy forces.. In this mission, your primary objective is to leverage the Colossus's ability to traverse between high and low terrain to counter the numerical advantage of 32 Zerglings. Your top priority is to capitalize on this mobility advantage, maneuvering between the two terrains to delay the Zerglings' approach, engage in ranged attacks on pursuing Zerglings, and ensure the survival of the Colossus. 2.Map Information 2.1 Coordinate System: - The screen coordinate system is a 2D grid with a range of [0, 256] × [0, 256]. - Each grid cell measures 32 × 32 pixels, resulting in a total of 8 × 8 = 64 grid cells. - The origin point (0, 0) is at the top-left corner, while (256, 256) is at the bottom-right corner. 2.2 Terrain Information: - Pathway (connecting the high ground and low ground): The map consists of an upper platform (high ground) and a lower platform (low ground), connected by a narrow pathway on the left side. - High Ground (light green area): The high ground is characterized by light green grass textures and is located on the upper platform. - Low Ground (dark gray area): The low ground is characterized by dark gray concrete textures and is located on the lower platform. - Impassable Area (black background): The edges of the map consist of a black background, which is impassable. The Colossus must avoid entering these areas. - Cliff (light gray vertical surface): The cliff separates the high ground and low ground. The Colossus can freely traverse the cliff, but Zerglings can only access the high ground through the narrow pathway on the left. Only the boundary between the high ground and low ground is considered a cliff; other deep red boundaries are impassable edges of the map. - Boundary: The boundaries between the high ground and the black background, as well as between the low ground and the black background, are considered impassable edges of the map. The boundaries are represented by deep red lines on the map. 3. Key Victory Points: 3.1 Maintain Maximum Attack Range: - Ideally, the Colossus should stay within its maximum attack range (range 7) to fully utilize its long-range attack advantage while minimizing the risk of melee attacks from enemies. 3.2 Focus Fire on Dense Enemy Clusters: - Target the center of the densest enemy groups to maximize splash damage and quickly reduce enemy numbers. 3.3 Cliff Barrier Delay Effect: The Cliff Barrier Delay Effect refers to leveraging the terrain characteristics of cliffs, where enemy units cannot directly cross the cliff, while allied units can freely traverse it. This forces the enemy to take a detour when transitioning between high and low ground, buying time for allied units to reposition, cool down weapons, and strike effectively. - When enemies are heavily concentrated on the high ground: - If allied units are also on the high ground but risk being surrounded, it is recommended to retreat to the bottom-right low ground. This utilizes the cliff as a barrier to delay enemy pursuit, enabling ranged attacks on the enemies while maintaining battlefield control. - When enemies are heavily concentrated on the low ground: - If allied units are also on the low ground but risk being surrounded, it is recommended to retreat to the top-right high ground. The cliff barrier forces enemies to take a detour, providing time for allied units to reposition and counterattack effectively.
4. Decision Process
Based on the observed data of the Colossus and Zerglings, as well as the RGB image of the screen, use the following decision rules to determine action priorities:
text4.1 Condition Assessment Rules: - IS_HIGH_GROUND: - Criteria: If the Colossus is on the green high ground (above the cliff, with a clear boundary from the low ground), set to True; otherwise, set to False. - Basis: Determined by verifying the Colossus's position in the green area via RGB image and observation data. - IS_ZERG_HIGH_GROUND: - Criteria: If the Zerglings are on the green high ground (above the cliff, with a clear boundary from the low ground), set to True; otherwise, set to False. - Basis: Determined by analyzing the Zerglings' positions via RGB image. - IS_ZERG_NEAR: - Criteria: If at least 5 Zerglings are within the attack range (≤7) or attempting to surround the Colossus, set to True; otherwise, set to False. - Basis: Determined through observation data (distance and position) and RGB image analysis. - IS_AT_CLIFF: - Condition: If the Colossus is near the edge of a cliff, the condition is True; otherwise, it is False. It is important to distinguish between a cliff and a boundary. When the Colossus is near the red boundary line, it is not considered a cliff. A cliff is the transitional area between the highland and the lowland. - Determination Basis: This is analyzed through the RGB image and the positional information of the Colossus. - IS_COLOSSUS_WEAPON_COOLDOWN: - Condition: If the Colossus's weapon cooldown time is greater than 0, the condition is True; otherwise, it is False. - Determination Basis: This is determined by analyzing the Colossus's weapon cooldown time. - IS_COLOSSUS_HEALTH_LOW: - Condition: If the Colossus's health is below 60%, this condition is True; otherwise, it is False. - Determination Basis: Determined by observing the Colossus's health. 4.2 Action Decision Rules - Special Scenarios: rule1 :As long as the weapon is in cooldown, the priority of movement actions is higher than attack actions. Movement > Attack. rule2:When the Colossus is about to be surrounded by enemies, regardless of the weapon's status, it should prioritize moving to escape the encirclement. rule3:Early Game Phase (<6s), at the start of the game, the Colossus should prioritize moving to the right to increase the distance from the narrow pathway on the left and wait for all Zerglings to reach the high ground. During this phase, movement takes priority over attacking. - Situation 1: The Colossus and Zerglings are in different areas - Weapon not on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = False): - Priority: Attack. The Colossus should immediately attack enemies, leveraging its ranged advantage to maximize damage. - Weapon on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True): - Priority: Move. The Colossus should create distance from the enemies to avoid being approached and attack once the weapon cooldown ends. - Situation 2: The Colossus and Zerglings are in the same area - Enemies are near and about to attack (IS_ZERG_NEAR = True, Zerglings approaching): - Priority: Move. Regardless of whether the weapon is on cooldown, the Colossus should immediately retreat, quickly creating distance from the enemies to avoid being surrounded. This ensures it can fully leverage its ranged advantage and avoid the disadvantage of close combat. Once out of danger, it can then look for opportunities to counterattack. - Enemies are not near (IS_ZERG_NEAR = False): - Weapon not on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = False): - Priority: Attack. The Colossus should take the opportunity to attack enemies and weaken their forces. - Weapon on cooldown (IS_COLOSSUS_WEAPON_COOLDOWN = True): - Priority: Move. The Colossus should reposition to maintain a safe distance while waiting for the weapon cooldown to end. - Situation 3 :When IS_COLOSSUS_HEALTH_LOW = True: -Priority Strategy: - Avoid direct engagement with Zerglings and maintain maximum attack range. - Utilize terrain advantages and adopt hit-and-run tactics, moving flexibly between high and low ground. - Priority: Movement > Attack
5 Integrated Strategy and Decision Framework
1. Strategy Suggestions
text1.1 Common Game Strategies - At the start of the game, the Colossus should first move to the right, keeping its Y-coordinate stable while gradually increasing its X-coordinate. This allows the Colossus to maintain distance from the Zerglings climbing up to the high ground and remain positioned on the high ground to wait for most Zerglings to pass through the narrow pathway on the left. This maneuver prevents the Colossus from descending to the low ground too early, avoiding the risk of being surrounded by Zerglings that have not yet reached the high ground. - Once most enemy units have entered the high ground, the Colossus should increase its Y-coordinate to retreat to the low ground via the cliff, avoiding encirclement by Zerglings on the high ground. 1.2 Emergency Response Strategies: - If encircled on the high ground, the Colossus should drastically increase the Y-coordinate to quickly escape the high ground and retreat to the low ground. - If encircled on the low ground, the Colossus should drastically decrease the Y-coordinate to quickly escape the low ground and retreat to the high ground. 2. Movement and Attack Decision 2.1 When Movement Is Required: - Phase Judgment: - Determine the current phase based on game time and the positions of both allied and enemy units. - Evaluate Priorities: - Decide if movement takes priority over attacking (e.g., prioritize movement when the weapon is on cooldown or there is high risk). - Analyze Enemy Movements: - Observe the Zerglings’ movements to predict their attack paths. - Use the RGB image and the Cliff Barrier Delay Effect to choose the safest movement path (e.g., move down-right or up-right). - Analyze Movement Along X and Y Axes: - X-Axis Movement: - Increase X value (move right): Create distance from the left-side pathway and maintain horizontal spacing from Zerglings. - Decrease X value (move left): Rarely recommended unless repositioning to a strategic location. - Y-Axis Movement: - Decrease Y value (move upward): Retreat from the low ground to the high ground, leveraging the cliff to block enemies. - Increase Y value (move downward): Retreat from the high ground to the low ground to delay enemy advances. 2.2 When Attacking Is Required: - Target Selection: - Prioritize the densest groups of enemies to maximize splash damage. - Focus fire on high-health enemy units that pose the greatest threat to the Colossus. - Determine the Best Attack Point: - Aim for the densest clusters of Zerglings to reduce their numbers and weaken their combat strength.
6.Analysis:
In your response, please follow the format outlined in the example output, including section titles, numbering, and the terminology used. Ensure that you use the same variable names and terms specified in the decision process and chain of thought.
textThe output should include: 1. Decision Process 1.1 Condition Assessment: - `IS_COLOSSUS_HIGH_GROUND = True/False` - `IS_ZERGLING_HIGH_GROUND = True/False` - `IS_ZERGLING_NEAR_COLOSSUS = True/False` - `IS_COLOSSUS_AT_CLIFF = True/False` - `IS_COLOSSUS_WEAPON_COOLDOWN = True/False` - `IS_COLOSSUS_HEALTH_LOW = True/False' 1.2 Action Decision Rules Brief Recap - Special Scenario: - When the weapon is on cooldown: Move > Attack - When encirclement is imminent, prioritize movement to escape, regardless of weapon cooldown status. - Early game (<6s), prioritize moving right to distance from the left pathway and wait for Zerglings on the high ground; movement > attack. - Scenario 1: The Colossus and Zerglings are in different areas - Weapon not on cooldown: Prioritize attacking, leveraging range advantage to maximize damage. - Weapon on cooldown: Prioritize moving to create distance, then attack after cooldown ends. - Scenario 2: The Colossus and Zerglings are in the same area - Enemies are close: Prioritize moving to avoid melee combat, then counterattack. - Enemies are not close: - Weapon not on cooldown: Prioritize attacking to weaken enemy forces. - Weapon on cooldown: Prioritize moving to maintain a safe distance. - Scenario 3: The Colossus is low on health (IS_COLOSSUS_HEALTH_LOW = True) - Enemies are close: - Prioritize moving to avoid direct engagement, leveraging terrain to create distance and delay pursuit. - Once a safe position is reached, counterattack cautiously while maintaining maximum attack range. - Enemies are not close: - Weapon not on cooldown: Focus on targeting the densest enemy clusters from a safe position to reduce their numbers. - Weapon on cooldown: Continue repositioning to maintain distance and avoid being flanked. 1.3 Judgment Based on 4.2 Action Decision Rules: -Provide movement and attack priorities, e.g., attack > move or move > attack. -Note: -Special rules, Scenario 1, Scenario 2, or Scenario 3 should align with only one scenario that best fits the current situation. -Always prioritize special rules (such as weapon cooldown or imminent encirclement). If none apply, proceed to select an action rule based on Scenario 1, Scenario 2, or Scenario 3. 2 Integrated Strategy and Decision Framework 2.1 Strategy Suggestions - Current Phase Determination: Dynamically determine the strategy based on time, the Colossus’s position, and Zergling distribution. Key factors include time range, the Colossus’s X/Y coordinates, and enemy movement trends. - Consider Common Action Recommendations:: - Early Game Observation Phase: Initially, keep the X-coordinate steady while gradually increasing the Y-coordinate to monitor enemy movements. Utilize the Colossus’s range advantage to weaken enemy forces. - Emergency Evacuation Phase: - Encircled on High Ground: The Colossus should drastically increase the Y-coordinate to quickly retreat to the low ground and escape the encirclement. - Encircled on Low Ground: The Colossus should drastically decrease the Y-coordinate to swiftly return to the high ground and avoid threats. - Late-Game Kite-and-Pull Phase: In the late game, the Colossus should maneuver flexibly near the cliff, leveraging the terrain to force enemy repositioning. Focus on ranged attacks to weaken enemy forces and maintain control of the battlefield. 2.2 Movement Decision Analysis - Enemy Approach Path Prediction: Analyze the primary approach paths of enemy units and identify potential threats. - Movement Target Suggestions: List multiple candidate movement points, including defensive positions on the high ground, safe retreat points on the low ground, and avoidance points. Specify the most suitable movement target (e.g., a safe spot on the high ground or near the cliff on the low ground). Explain the reasoning behind the selection (e.g., reducing the risk of being surrounded or gaining a better position for attacks). 2.3 Attack Decision Analysis - Target Prioritization: Prioritize attacking dense enemy clusters to maximize splash damage. If threatening units are approaching, prioritize targeting those closest to the Colossus to ensure its safety. - Attack Target Selection: Identify multiple candidate attack points (e.g., the center of a dense enemy group or the closest group to the Colossus). Specify the optimal attack point and provide reasoning (e.g., to maximize damage or avoid being overwhelmed). 7.Action Output: Team Colossus-1: Action steps.
--------------------------------------------------Example Input Prompt--------------------------------------------------
textGame Info Time: 0:05 Team Colossus-1 Info: Team minimap position: [31, 27] Controlled Team Units: Unit: Colossus, Tag: 0x100000001, ScreenPos: [150, 123], Health: 350 (100%), Weapon Cooldown Time: 0.91s Nearby Enemy Units: Unit: Zergling, Tag: 0x1002c0001, ScreenPos: [52, 164], Distance: 10, Health: 6 (17%) Unit: Zergling, Tag: 0x100180001, ScreenPos: [79, 159], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100440001, ScreenPos: [112, 140], Distance: 4, Health: 35 (100%) Unit: Zergling, Tag: 0x100640001, ScreenPos: [38, 176], Distance: 12, Health: 21 (60%) Unit: Zergling, Tag: 0x100700001, ScreenPos: [52, 178], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x100780001, ScreenPos: [42, 169], Distance: 11, Health: 6 (17%) Unit: Zergling, Tag: 0x100540001, ScreenPos: [49, 171], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x100500001, ScreenPos: [91, 146], Distance: 6, Health: 35 (100%) Unit: Zergling, Tag: 0x100800001, ScreenPos: [83, 147], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100280001, ScreenPos: [96, 151], Distance: 6, Health: 21 (60%) Unit: Zergling, Tag: 0x1000c0001, ScreenPos: [59, 162], Distance: 9, Health: 35 (100%) Unit: Zergling, Tag: 0x100040001, ScreenPos: [93, 157], Distance: 6, Health: 35 (100%) Unit: Zergling, Tag: 0x100340001, ScreenPos: [67, 172], Distance: 9, Health: 35 (100%) Unit: Zergling, Tag: 0x1001c0001, ScreenPos: [72, 161], Distance: 8, Health: 6 (17%) Unit: Zergling, Tag: 0x100600001, ScreenPos: [76, 152], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x100140001, ScreenPos: [86, 162], Distance: 7, Health: 35 (100%) Unit: Zergling, Tag: 0x1004c0001, ScreenPos: [117, 145], Distance: 3, Health: 35 (100%) Unit: Zergling, Tag: 0x100400001, ScreenPos: [66, 164], Distance: 9, Health: 21 (60%) Unit: Zergling, Tag: 0x100200001, ScreenPos: [78, 166], Distance: 8, Health: 35 (100%) Unit: Zergling, Tag: 0x100240001, ScreenPos: [104, 144], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100100001, ScreenPos: [59, 172], Distance: 10, Health: 35 (100%) Unit: Zergling, Tag: 0x1003c0001, ScreenPos: [86, 154], Distance: 7, Health: 6 (17%) Unit: Zergling, Tag: 0x100480001, ScreenPos: [46, 159], Distance: 11, Health: 35 (100%) Unit: Zergling, Tag: 0x1006c0001, ScreenPos: [69, 155], Distance: 8, Health: 35 (100%) Unit: Zergling, Tag: 0x100580001, ScreenPos: [59, 154], Distance: 9, Health: 6 (17%) Unit: Zergling, Tag: 0x100740001, ScreenPos: [130, 134], Distance: 2, Health: 35 (100%) Unit: Zergling, Tag: 0x1005c0001, ScreenPos: [97, 142], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100380001, ScreenPos: [39, 183], Distance: 12, Health: 35 (100%) Unit: Zergling, Tag: 0x100080001, ScreenPos: [106, 151], Distance: 5, Health: 35 (100%) Unit: Zergling, Tag: 0x100300001, ScreenPos: [124, 137], Distance: 2, Health: 35 (100%) Relevant Knowledge: Protoss.Colossus The large quad-legged vehicle fires lasers in a splash pattern well-suited to destroying swarms of weaker units. This unit can also traverse differences in terrain height due to its long legs, and will appear to step over ledges and other obstacles due to the inverse kinematics system. Unit properties: ['ground', 'air', 'armored', 'massive', 'mechanical'] Weapon info: Attack Range 7, target: ['ground'], anti: ['light'], DPS(damage per second) 13, DPS-anti 20 Unit abilities: Zerg.Zergling Fast but weak melee attacker ideal for swarming attacks in large numbers. Unit properties: ['ground', 'light', 'biological'] Weapon info: Attack Range 1, target: ['ground'], DPS(damage per second) 10 Valid Actions: Team Colossus-1 Valid Actions: <Attack_Unit(tag)> <Move_Screen(screen)> Action Args: (1) tag: tag refers to a hexadecimal number, shape as 0x000000000. (2) screen: screen refers to a screen coordinate, shape as [x, y], where x and y range from 0 to 256. (3) minimap: minimap refers to a minimap coordinate, shape as [x, y], where x and y range from 0 to 64. For example, when you want to use an action like <Action_Name(tag, screen)>, you should output like <Action_Name(0x100580001, [37, 55])>; when you want to use an action like <Action_Name(screen)>, you should output like <Action_Name([66, 78])>. Make sure to see whether an action uses screen coordinates or minimap coordinates. If an action name is XXXX_Screen, it uses screen coordinates; if an action name is XXXX_Minimap, it uses minimap coordinates. Last Step Actions: Team Colossus-1: <Move_Screen([160, 130])> <Attack_Unit(0x1007c0001)> <Attack_Unit(0x100740001)> You need to confirm whether the previous action finished executing, and based on this, determine whether to continue the old strategy or immediately take other actions. Give each team no more than 2 actions, among which activity release should usually before move and attack. Now, start generating your analysis and actions:
Give each team no more than 3 actions.
Now, start generating your analysis and actions:
--------------------------------------------------Example Output Prompt--------------------------------------------------
Analysis:
Decision Process
1.1 Condition Assessment
text- IS_COLOSSUS_HIGH_GROUND = True The Colossus is currently on the high ground, confirmed via the image and coordinates. - IS_ZERGLING_HIGH_GROUND = True Zerglings are primarily concentrated on the high ground, in the same area as the Colossus. - IS_ZERGLING_NEAR_COLOSSUS = True Multiple Zerglings are close to the Colossus and within its attack range (≤7), posing an immediate threat. - IS_COLOSSUS_AT_CLIFF = False The Colossus is near the red boundary of the map, not near the cliff (the border between green and gray areas). - IS_COLOSSUS_WEAPON_COOLDOWN = True The weapon is on cooldown and cannot immediately attack enemy units. - IS_COLOSSUS_HEALTH_LOW = False The Colossus's health is at 100%, with no low-health threat.
1.2 Action Decision Rules
Decision basis:
1. Special Rules:
- When the weapon is on cooldown, movement takes precedence over attack.
- When surrounded or under close threat, prioritize movement to escape danger.
text2. Scenario Analysis: - Current scenario: The Colossus and Zerglings are in the same area, with Zerglings approaching the Colossus: - The weapon is currently on cooldown, and the large number of fast-approaching Zerglings necessitates movement to avoid being surrounded and to create safe distance. - Terrain condition: - Being close to the red boundary of the map limits the Colossus’s maneuverability, further highlighting the importance of movement.
1.3 Judgment Based on Action Decision Rules
Integrated Strategy and Decision Framework
2.1 Strategy Suggestions
textCurrent Strategy: Emergency Retreat to Low Ground - The Colossus and Zerglings are both on the high ground. Given the large number of enemies and their proximity, the situation is critical. - The Colossus must quickly retreat to the low ground (gray area), using the cliff barrier to delay Zerglings' pursuit. This retreat will buy time for the weapon to cool down and create a safe distance to regain battlefield control.
2.2 Movement Decision Analysis
-Enemy Approach Path Prediction
- Zerglings are advancing from the left pathway on the high ground, attempting to encircle the Colossus.
- If the Colossus does not retreat immediately, it risks being cornered near the red boundary, losing space to maneuver.
Movement Target Suggestions
Reason:
2.3 Attack Decision Analysis
text- Target Prioritization: - Once the weapon cooldown ends, prioritize attacking the nearest Zerglings to reduce immediate threats and maximize splash damage. - Attack Target Selection: - The closest Zerglings (e.g., Zergling, Tag: 0x1004c0001 and 0x100740001) pose a direct threat to the Colossus. - After repositioning on the low ground, prioritize attacking dense clusters of pursuing Zerglings.
3 .Actions:
Team Colossus-1:
<Move_Screen([160, 200])> # Retreat directly from the high ground to the low ground, avoiding Zergling encirclement and using the cliff to delay pursuit.
<Attack_Unit(0x1004c0001)> # After the weapon cooldown ends, prioritize attacking the nearest Zergling to reduce the threat.
<Attack_Unit(0x100740001)> # Continue attacking other nearby Zerglings, using splash damage to weaken the enemy forces.
进程已结束,退出代码0
下面给出一个「三段式」提示词的大幅修改版本,以满足“先在系统提示中加入详细规则与机制(多候选投票、事件提醒、图像/地形多步分析),再将无法改动的环境输入原样拼接,最后用用户请求触发大模型一次性输出完整答案”的需求。以下示例同时给出中英文两种描述方式,供论文写作或研发调试时参考。
plaintext[System Prompt Start] ───────────────────── 中文部分 (Chinese) ───────────────────── 你是一款多模态大型语言模型,当前在 StarCraft II (星际争霸 II) 场景下担任 Protoss 前线指挥官。 本次主要目标:利用可自由跨越高低地形的巨像(Colossus) 对抗 32 条 Zergling 的数量优势。你需要在提示词中综合多候选投票机制(Multiple-Candidate Voting Mechanism)、事件提醒(Event Reminder),并进行图像/地形的多步分析(Multi-step Visual/Terrain Analysis),最终输出最优的动作序列。 为确保不破坏已有的环境逻辑,我们采用“三段式”结构: 1. 系统提示(即本段内容):注入多候选思维链投票机制、事件提醒等; 2. 环境输入(后续会直接原样拼接,不能改动其文本); 3. 用户请求(最后一段,触发一次性输出完整答案)。 请严格遵守以下规则: ### (A) 多候选投票机制 (Multiple-Candidate Voting Mechanism) 1. 当需要做关键决策时,应并行生成多条“思维链 (Chain of Thought)”或多个候选方案。 2. 对这些方案进行投票或打分,选出最优方案。 3. 可以在最终输出中简要展示投票/打分过程,也可只输出最优动作序列。 ### (B) 事件提醒 (Event Reminder) 1. 你可能会收到一些关键事件提示,如「包围风险 (EncirclementRisk) = True」或「巨像血量告急 (LowHealthColossus) = True」。 2. 如果事件提醒为 True,则须在推理时优先处理: - 如果“包围风险”为真,先考虑撤退; - 如果“血量告急”为真,尽量避免正面交战。 ### (C) 图像/地形多步分析 (Multi-step Visual/Terrain Analysis) 1. 你将得到地图/地形相关的信息(如高地、低地、悬崖位置,以及单位坐标)。 2. 在思维链中,需要多步推断: - 当前巨像所处位置是否在高地或低地; - Zergling 分布是否接近; - 是否可利用悬崖穿越来延迟敌方追击等。 ### (D) 动作输出格式 (Action Output Format) 1. 最终回答中应明确给出动作列表,比如 `<Move_Screen([X, Y])>`, `<Attack_Unit(tag)>`, `<No_Operation()>` 等。 2. 如果需要移动较远距离,可在移动后插入若干 `<No_Operation()>` 以模拟时间消耗。 3. 如果在决策中使用多候选方案,可在输出中罗列方案并说明投票结果,或只保留最优方案。 4. 请注意已有的动作定义(如 `_Screen` 表示使用屏幕坐标;`_Unit` 表示攻击单位)需保持一致。 ───────────────────── English Section ───────────────────── You are a multimodal Large Language Model acting as a Protoss frontline commander in a StarCraft II scenario. Your main objective is to utilize the Colossus’s ability to traverse between high and low ground to handle the numerical advantage of 32 Zerglings. You must incorporate a **Multiple-Candidate Voting Mechanism**, an **Event Reminder** approach, and **Multi-step Visual/Terrain Analysis** into your reasoning. Finally, you will produce an optimal action sequence. To preserve the existing environment logic, we adopt a “three-part” structure: 1. **System Prompt** (this section) – injecting the multi-candidate chain-of-thought voting mechanism, event reminders, etc. 2. **Environment Input** (will be concatenated as-is, without modifications) 3. **User Request** (the final part that triggers a single comprehensive answer). Please follow these rules: #### (A) Multiple-Candidate Voting Mechanism 1. When making critical decisions, generate multiple parallel Chains of Thought or candidate plans. 2. Conduct an internal voting or scoring process among these candidate plans. 3. You may optionally showcase the voting results in the final output or just provide the best plan’s action list. #### (B) Event Reminder 1. You may receive special event flags like “EncirclementRisk = True” or “LowHealthColossus = True.” 2. If an event is true, it must be prioritized in your decision-making: - “EncirclementRisk = True” → consider retreat first. - “LowHealthColossus = True” → avoid direct engagement if possible. #### (C) Multi-step Visual/Terrain Analysis 1. You will receive map/terrain info (e.g., high ground, low ground, cliff location, unit coordinates). 2. In your chain-of-thought, reason step by step about: - Whether the Colossus is on high or low ground; - Whether Zerglings are near; - Using the cliff to delay enemy pursuit, etc. #### (D) Action Output Format 1. You must provide a clear list of actions such as `<Move_Screen([X, Y])>`, `<Attack_Unit(tag)>`, or `<No_Operation()>`. 2. If long-distance movement is required, insert `<No_Operation()>` actions to simulate time or delay. 3. If multiple candidate plans are used, you may list them with votes/scores or only show the best plan’s actions. 4. Preserve the existing action definitions and naming conventions (e.g., `_Screen` for screen coordinates, `_Unit` for attacking a target). [System Prompt End]
说明:此段内容是“不可改动的”游戏与单位观测信息,直接原样拼接给模型。以下文本摘自题目中所示,可做示例。你可以根据需要在实际工程里替换为最新的观测数据。以下为原样示例:
plaintext[Environment Input Start] D:\Anaconda\envs\starcraftii0828\python.exe "E:\code\llm-pysc2-develop 09242320\llm-pysc2-develop 10092340\llm-pysc2-develop\llm_pysc2\lib\llm_prompt.py" pygame 2.6.0 (SDL 2.28.4, Python 3.9.19) Hello from the pygame community. https://www.pygame.org/contribute.html --------------------------------------------------System Prompt-------------------------------------------------- 1. Role Description You are a Protoss frontline commander, controls several Stalkers. Responsible for providing cover for the main force and restraining enemy forces.. In this mission, your primary objective is to leverage the Colossus's ability to traverse between high and low terrain to counter the numerical advantage of 32 Zerglings. Your top priority is to capitalize on this mobility advantage, maneuvering between the two terrains to delay the Zerglings' approach, engage in ranged attacks on pursuing Zerglings, and ensure the survival of the Colossus. 2.Map Information ... (以下省略,保持原样拼接) (注:此处省略大量文字,只要与题目中“System Prompt”和“Example Input Prompt/Output Prompt”等部分内容一致即可) Give each team no more than 3 actions. Now, start generating your analysis and actions: [Environment Input End]
说明:这是触发大模型“一次性输出完整答案”的最终请求,可中英混合。示例如下:
plaintext[User Request Start] ───────────────────── 中文部分 (Chinese) ───────────────────── 请基于以上「系统提示 (System Prompt)」中说明的多候选投票机制、事件提醒、以及图像/地形多步分析的方法,结合「环境输入 (Environment Input)」给出的游戏观测数据,一次性输出你的完整决策过程与最终动作。 具体要求: 1. 若需做关键决策,可列出不同候选思维链并进行投票; 2. 如果有包围风险或血量告急之类的事件提醒,请优先处理; 3. 最终给出不超过 3 条动作指令(如 <Move_Screen([X, Y])>, <Attack_Unit(tag)>),并可附少量 <No_Operation()>; 4. 可在输出中简要说明为什么选此方案。 ───────────────────── English Section ───────────────────── Based on the “System Prompt” guidelines (multi-candidate voting, event reminders, multi-step terrain analysis) and the “Environment Input” (unchangeable observation data), please produce a single comprehensive answer detailing your decision-making process and final actions. Requirements: 1. If critical decisions are needed, you may list multiple candidate lines of thought and perform a voting or scoring step. 2. If any event like “EncirclementRisk = True” or “LowHealthColossus = True” is signaled, prioritize it in your reasoning. 3. Provide no more than 3 action commands (e.g. `<Move_Screen([X, Y])>`, `<Attack_Unit(tag)>`) and optionally include some `<No_Operation()>` calls. 4. Optionally explain briefly why the chosen plan is best. [User Request End]
拼接顺序
可扩展性
输出示例
plaintextAnalysis: 1. Candidate Plans: Plan A: ... Plan B: ... Voting: Plan B > Plan A 2. Final Action Sequence: <Move_Screen([160, 200])> <No_Operation()> <Attack_Unit(0x1004c0001)> # Reason: Retreat first, then attack nearest threat...
论文写作
以上即为“三段式”提示词的大幅度修改版本:
整个结构既能让模型深度利用视觉及事件信息,也能保证对环境输入的最小侵入性,方便后续灵活扩展或论文撰写。祝顺利!
请你概括一下各个部分提示词的最新框架以及组成部分。
下面给出一个简明的最新“三段式”Prompt 框架概览,总结了各部分的功能与组成要点:
角色与目标
核心机制
动作输出格式约束
<Move_Screen([X, Y])>
、<Attack_Unit(tag)>
等;<No_Operation()>
;D:\Anaconda\envs\...
,pygame 2.6.0...
)。特点:这部分文本通常是游戏引擎或 PySC2 接口自动生成的原始信息,不做额外修改,直接拼接在提示里让大模型阅读。
通过这种“三段式”结构,可以最大化利用 Prompt 工程的灵活性(如多思维链投票与事件优先级处理),同时保持对已有环境逻辑的最小侵入,适合在论文写作或实际研发中使用。