Description
Hello, I am trying to use STORM (Co-STORM) to generate Chinese articles.
This is my code:
from knowledge_storm.collaborative_storm.engine import CollaborativeStormLMConfigs, RunnerArgument, CoStormRunner
from knowledge_storm.lm import DeepSeekModel
from knowledge_storm.logging_wrapper import LoggingWrapper
from knowledge_storm.rm import DuckDuckGoSearchRM
import os
# Co-STORM adopts the same multi LM system paradigm as STORM
lm_config: CollaborativeStormLMConfigs = CollaborativeStormLMConfigs()
kwargs = {
    "api_key": "xxx",
    "api_base": "https://api.siliconflow.cn",
    "api_provider": "openai",
    "temperature": 1.0,
    "top_p": 0.9,
}
model_name = "Qwen/Qwen2.5-7B-Instruct"
question_answering_lm = DeepSeekModel(model=model_name, max_tokens=1000, **kwargs)
discourse_manage_lm = DeepSeekModel(model=model_name, max_tokens=500, **kwargs)
utterance_polishing_lm = DeepSeekModel(model=model_name, max_tokens=2000, **kwargs)
warmstart_outline_gen_lm = DeepSeekModel(model=model_name, max_tokens=500, **kwargs)
question_asking_lm = DeepSeekModel(model=model_name, max_tokens=300, **kwargs)
knowledge_base_lm = DeepSeekModel(model=model_name, max_tokens=1000, **kwargs)
lm_config.set_question_answering_lm(question_answering_lm)
lm_config.set_discourse_manage_lm(discourse_manage_lm)
lm_config.set_utterance_polishing_lm(utterance_polishing_lm)
lm_config.set_warmstart_outline_gen_lm(warmstart_outline_gen_lm)
lm_config.set_question_asking_lm(question_asking_lm)
lm_config.set_knowledge_base_lm(knowledge_base_lm)
# Check out the Co-STORM's RunnerArguments class for more configurations.
topic = input('Topic: ')
runner_argument = RunnerArgument(topic=topic)
logging_wrapper = LoggingWrapper(lm_config)
# bing_rm = BingSearch(bing_search_api_key=os.environ.get("BING_SEARCH_API_KEY"),
#                      k=runner_argument.retrieve_top_k)
duckduckgo_rm = DuckDuckGoSearchRM(k=runner_argument.retrieve_top_k)
costorm_runner = CoStormRunner(
    lm_config=lm_config,
    runner_argument=runner_argument,
    logging_wrapper=logging_wrapper,
    rm=duckduckgo_rm,
    # rm=bing_rm
)
# Warm start the system to build shared conceptual space between Co-STORM and users
costorm_runner.warm_start()
# Step through the collaborative discourse
# Run either of the code snippets below in any order, as many times as you'd like
# To observe the conversation:
# conv_turn = costorm_runner.step()
# To inject your utterance to actively steer the conversation:
# costorm_runner.step(user_utterance="我认为人工智能会毁灭世界")
# Generate report based on the collaborative discourse
costorm_runner.knowledge_base.reogranize()
article = costorm_runner.generate_report()
print(' >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ')
print(article)
# topic: Artificial Intelligence Development
# topic:人工智能发展
The problem I encountered is that when I enter a Chinese topic such as “人工智能发展”, no content is generated, whereas with "Artificial Intelligence Development" the article is generated normally.
I debugged the code. The first problem I found was that Chinese characters were not handled when the expert name string was split. I changed line 430 of storm/knowledge_storm/collaborative_storm/engine.py from:
role_name, role_description = expert_name.split(":")
to
if ":" in expert_name:
    role_name, role_description = expert_name.split(":")
elif ":" in expert_name:  # full-width colon (U+FF1A) used in Chinese text
    role_name, role_description = expert_name.split(":")
The first problem is solved.
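A slightly more defensive variant of the same fix is sketched below. It is not code from the STORM repository: `split_expert_name` is a hypothetical helper name, and the fallback of returning an empty description when no colon is present is an assumption about what the caller can tolerate. It splits on the first ASCII or full-width colon only, so a description that itself contains a colon does not raise a ValueError during unpacking.

```python
import re
from typing import Tuple

# Matches the ASCII colon ":" and the full-width colon (U+FF1A) used in Chinese text.
_COLON_PATTERN = re.compile(r"[:\uFF1A]")


def split_expert_name(expert_name: str) -> Tuple[str, str]:
    """Split "role: description" on the first ASCII or full-width colon."""
    parts = _COLON_PATTERN.split(expert_name, maxsplit=1)
    if len(parts) == 2:
        role_name, role_description = parts
    else:
        # Assumption: with no colon at all, treat the whole string as the role name.
        role_name, role_description = expert_name, ""
    return role_name.strip(), role_description.strip()


print(split_expert_name("经济学家:研究人工智能对就业的影响"))
```

With a helper like this, the line in engine.py would become `role_name, role_description = split_expert_name(expert_name)`.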
But when I re-run the program and enter the Chinese topic, it still does not output the article.
I debugged the code and found that the last line of the forward method of the ArticleGenerationModule class in storm/knowledge_storm/collaborative_storm/modules/article_generation.py, return "\n".join(to_return), is executed twice on each run. The second time execution reaches this point, knowledge_base.root.children is an empty list, which means the tree has not been constructed correctly by then. I'm kind of overwhelmed by the code and not sure how to proceed with debugging. Can you provide some help?
Thanks.
Additionally, there is a strange detail: Chinese topics do not always fail. Out of around 20 attempts, one or two succeed.
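To narrow down where the tree ends up empty, a small helper like the one below could be called on knowledge_base.root right after warm_start() and again right before generate_report(). This is only a debugging sketch: `describe_tree` is a hypothetical name, the `children` attribute comes from the observation above, and the `name` attribute is an assumption (the helper falls back to repr() if it is missing). The example runs on a stand-in tree so that it is self-contained.

```python
from types import SimpleNamespace


def describe_tree(node, depth=0):
    """Return one indented line per node so an empty tree is easy to spot."""
    # `name` is an assumed attribute; fall back to repr() when it is absent.
    label = getattr(node, "name", repr(node))
    children = getattr(node, "children", None) or []
    lines = [f"{'  ' * depth}{label} ({len(children)} children)"]
    for child in children:
        lines.extend(describe_tree(child, depth + 1))
    return lines


# Stand-in tree; with the real runner, pass costorm_runner.knowledge_base.root instead.
root = SimpleNamespace(name="root", children=[SimpleNamespace(name="背景", children=[])])
print("\n".join(describe_tree(root)))
```

Comparing the output at both points for a Chinese topic and for an English topic would show whether the tree is never populated or is populated and later emptied.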