
The large language model for traditional Chinese medicine (TCM)

The large language model for traditional Chinese medicine (TCM) applies large-model techniques to two kinds of tasks: clinical auxiliary diagnosis and treatment (such as syndrome diagnosis and prescription recommendation) and TCM knowledge Q&A. The aim is to accelerate progress in both areas.

Currently, for the prescription recommendation task in intelligent TCM clinical diagnosis, an instruction-fine-tuned large model for TCM prescription recommendation has been released.


Demand for traditional Chinese medicine (TCM) is growing as it is adopted as complementary or preventive medicine in healthcare settings. TCM diagnosis identifies the patient's "syndrome", a description of their physical and mental state, from their complaints and symptoms, and it requires extensive knowledge and experience. Systems that predict syndromes from symptoms are therefore being researched to assist physicians in TCM diagnosis. However, patients often present with multiple syndromes, and existing research struggles to identify which syndrome is most critical to treat or when each syndrome began. This study therefore proposes modeling the relationships between syndromes and their onset times with the DEMATEL method and ISM analysis, so that the strength of the relationships between a patient's syndromes can be analyzed and presented to physicians.

DEMATEL Method (Decision Making Trial and Evaluation Laboratory)

Overview: The DEMATEL method is a technique used to clarify the causal relationships among elements in a complex system or problem. It is primarily employed to visualize the influence between factors and assist in problem-solving.

Steps:

Selection of Elements: Identify the main elements related to the problem.
Creation of Direct Influence Matrix: Based on the opinions of experts or stakeholders, evaluate how each element affects the others and create a matrix. Typically, a scale from 0 (no influence) to 4 (very strong influence) is used.
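Normalization: Normalize the direct influence matrix so that influence strengths are on a common scale, typically by dividing every entry by the largest row sum (or by the larger of the maximum row and column sums).
Calculation of Total Influence Matrix: Sum the successive powers of the normalized matrix N to obtain the total influence matrix T, which includes both direct and indirect influences. When the series converges, this equals the closed form T = N(I - N)^(-1).
Creation of Causal Diagram: Using the total influence matrix, create a graph that shows the causal relationships between elements. This helps identify which elements mainly influence others (causes) and which are mainly influenced (effects).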
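
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the three syndrome names and the 0 to 4 scores are made-up placeholders, the normalization divides by the larger of the maximum row and column sums (one common convention), and the total influence matrix is obtained by summing successive powers of the normalized matrix until the terms become negligible.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dematel(direct, tol=1e-12, max_iter=10_000):
    """Return (total influence matrix, prominence D+R, relation D-R)."""
    n = len(direct)
    row_sums = [sum(row) for row in direct]
    col_sums = [sum(direct[i][j] for i in range(n)) for j in range(n)]
    scale = max(max(row_sums), max(col_sums))
    norm = [[value / scale for value in row] for row in direct]

    # T = N + N^2 + N^3 + ... (direct plus indirect influence)
    total = [row[:] for row in norm]
    power = [row[:] for row in norm]
    for _ in range(max_iter):
        power = mat_mul(power, norm)
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j]
        if max(max(row) for row in power) < tol:
            break

    given = [sum(row) for row in total]                                  # D
    received = [sum(total[i][j] for i in range(n)) for j in range(n)]    # R
    prominence = [d + r for d, r in zip(given, received)]
    relation = [d - r for d, r in zip(given, received)]
    return total, prominence, relation


# Placeholder syndromes and expert scores (0 = no influence, 4 = very strong).
syndromes = ["qi deficiency", "blood stasis", "phlegm-dampness"]
direct = [
    [0, 3, 2],
    [1, 0, 3],
    [0, 1, 0],
]
total, prominence, relation = dematel(direct)
for name, p, r in zip(syndromes, prominence, relation):
    print(f"{name}: prominence={p:.3f}, relation={r:+.3f}")
```

With these placeholder scores, the first syndrome has a positive relation value (a net cause) and the third a negative one (a net effect). Plotting prominence against relation gives the causal diagram.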

ISM Analysis (Interpretive Structural Modeling)

Overview: ISM analysis is a method used to clarify the hierarchical structure of elements in a complex system or problem. By organizing the interrelationships among elements hierarchically, it helps to understand the essence of the problem.

Steps:

Selection of Elements: Identify the main elements related to the problem.
Evaluation of Interrelationships: Evaluate the relationships between each pair of elements and create an adjacency matrix to determine whether each element influences the others.
Creation of Reachability Matrix: Based on the adjacency matrix, create a reachability matrix that shows whether it is possible to reach each element from another.
Hierarchical Structuring: Analyze the reachability matrix to clarify the hierarchical structure of the elements, organizing them from the lowest to the highest level.
Creation of Structural Model: Based on the hierarchical structure, create a structural model (diagram) that shows the relationships between elements. This visual representation helps to understand the overall picture of the problem and the role of each element.
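
The reachability and hierarchical structuring steps can be sketched as follows. This is a hedged illustration under stated assumptions: the four elements and their adjacency matrix are invented, reachability is computed with Warshall's transitive closure, and levels are extracted with the usual ISM rule that an element belongs to the current level when its reachability set is contained in its antecedent set.

```python
def reachability(adj):
    """Transitive closure of the adjacency matrix (each element reaches itself)."""
    n = len(adj)
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def ism_levels(reach):
    """Partition elements into levels; level 1 is listed first."""
    remaining = set(range(len(reach)))
    levels = []
    while remaining:
        level = []
        for i in remaining:
            reach_set = {j for j in remaining if reach[i][j]}
            antecedent_set = {j for j in remaining if reach[j][i]}
            # Element i reaches nothing outside the elements that reach it.
            if reach_set <= antecedent_set:
                level.append(i)
        levels.append(sorted(level))
        remaining -= set(level)
    return levels


# Invented example: 0 -> 1 -> 2 and 3 -> 2 (1 means "row influences column").
adj = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
levels = ism_levels(reachability(adj))
print(levels)
```

Here element 2, which influences nothing else, lands in the first level, while element 0, which only drives others, lands in the last. Drawing the levels as layers with the adjacency edges between them gives the structural model.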

By combining these methods, it is possible to analyze the relationships and onset times of multiple syndromes relevant to a patient and provide this information to physicians.

Large Model for TCM Prescription Recommendation

This project addresses the task of prescription recommendation in intelligent TCM clinical diagnosis. By integrating real-world clinical records, medical classics, and TCM textbooks, we have constructed a prescription recommendation instruction fine-tuning dataset comprising 68k data entries (a total of 10M tokens). Using this dataset, we performed large-scale instruction fine-tuning on the ChatGLM large model, ultimately resulting in the TCM prescription recommendation large model.

Instruction Fine-Tuning Dataset

Dataset Introduction

The instruction fine-tuning dataset constructed for this project includes data from 8 sources, encompassing 4 classic TCM textbooks: "Internal Medicine of TCM," "Surgery of TCM," "Gynecology of TCM," and "Pediatrics of TCM" (collectively referred to as ISGP), the 2020 edition of the Chinese Pharmacopeia (CHP), Chinese Medicine Clinical Cases (CMCC), as well as clinical medical records from several top-tier hospitals covering various diseases such as lung disease (Lung), stroke (Stroke), diabetes (Diabetes), liver disease (Liver), and spleen and stomach diseases (SSD). The 8 datasets collectively contain 68k entries, with a total token count of approximately 10M (5M input tokens and 4M output tokens).

Using Self-Instruct Approaches with OpenAI API for Professional Domains: Preliminary Exploration

Currently, many models such as Alpaca and Belle rely on the self-instruct approach. This methodology leverages the knowledge of large language models to generate diverse and creative instructions. It works well in general question-and-answer scenarios, where a large number of instructions can be constructed quickly for instruction tuning. However, in fields with low tolerance for errors, such as medicine and law, hallucinated outputs introduce noise into the instruction data and thereby reduce the model's accuracy. For instance, an improper diagnosis or prescription suggestion could endanger a patient's life, while factual errors in legal provisions and principles might cause a party to lose its case. The challenge, and a crucial research direction in the construction and annotation of instruction data, is therefore how to use the OpenAI API quickly without compromising the professional quality of that data.

Our initial experimental exploration aims to address this challenge. Here, we outline our preliminary approach and findings:

Expert-Reviewed Instructions: Instead of solely relying on the self-instruct approach, we integrate expert reviews into the instruction generation process. Experts in the medical and legal fields review and validate the generated instructions to ensure accuracy and relevance.
Contextual Data Integration: We enhance the models by incorporating domain-specific contextual data. This involves feeding the model with curated datasets that include verified medical records, legal documents, and precedents to reduce the likelihood of generating hallucinated outputs.
Adaptive Fine-Tuning: The models undergo adaptive fine-tuning using specialized corpora from medical and legal domains. This fine-tuning process helps the models to better understand the intricacies and nuances of these fields, thereby improving the quality of generated instructions.
Feedback Loop Mechanism: We implement a feedback loop mechanism where the outputs are continuously monitored and reviewed. Any identified errors or inaccuracies are fed back into the model to iteratively improve its performance.
Use of Robust Evaluation Metrics: To evaluate the effectiveness of our approach, we employ robust evaluation metrics that focus on accuracy, relevance, and reliability of the generated instructions. These metrics help in quantitatively assessing the improvements made by integrating expert reviews and contextual data.

Through these methods, our goal is to create a framework that allows for the rapid utilization of the OpenAI API while maintaining the integrity and accuracy of instructions in professional domains. Our preliminary results indicate a promising improvement in reducing hallucinated outputs and enhancing the overall quality of the generated instructional data. Further research and experimentation are ongoing to refine these approaches and ensure their scalability and effectiveness.
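
One way to picture the expert review and feedback loop is as a gate between generation and the dataset. The sketch below is hypothetical: `Candidate`, `review_round`, and the `toy_reviewer` stand-in (which only checks that an answer names a formula from a small approved list) are illustrative names, and a real reviewer would be a human expert rather than a function.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    instruction: str
    output: str
    notes: list = field(default_factory=list)  # reviewer feedback so far


def review_round(candidates, reviewer):
    """Split candidates into accepted ones and ones sent back with a note."""
    accepted, needs_revision = [], []
    for cand in candidates:
        approved, note = reviewer(cand)
        if approved:
            accepted.append(cand)
        else:
            cand.notes.append(note)
            needs_revision.append(cand)
    return accepted, needs_revision


# Stand-in for an expert: accept only outputs naming an approved formula.
APPROVED_FORMULAS = {"San Yuan Tang"}


def toy_reviewer(cand):
    if any(name in cand.output for name in APPROVED_FORMULAS):
        return True, ""
    return False, "output does not name a verified formula"


candidates = [
    Candidate("Recommend a prescription for prolonged postpartum weakness.",
              "San Yuan Tang, taken as a decoction."),
    Candidate("Recommend a prescription for prolonged postpartum weakness.",
              "An unnamed herbal mixture."),
]
accepted, needs_revision = review_round(candidates, toy_reviewer)
print(len(accepted), "accepted;", len(needs_revision), "returned for revision")
```

Candidates in `needs_revision` carry the reviewer's note, so a later generation pass can use it before the next review round.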

1.1 Multi-task Diagnostic Behavior Decomposition Instruction Construction Strategy

People remember and understand knowledge by encoding it implicitly in varied scenarios and narratives. How clearly something is remembered depends on how continuous and rich the learning process is. Interleaved learning, spaced practice, and diversified learning strengthen the consolidation of knowledge and lead to a deep understanding of a domain. Our approach draws on this human memory process, combining structured professional tables with the language representation capabilities of large language models. We define strict prompt templates that make the model generate outputs from traditional Chinese medicine (TCM) gynecology prescription data. These outputs cover 15 scenarios: patient treatment stories, diagnostic analysis, expected diagnostic and treatment outcomes, prescription functions, interactive stories, narrative medicine, tongue and pulse diagnosis, treatment plan formulation, critical thinking, follow-ups, prescriptions, drug dosages, case studies, real-world problems, and etiology and pathogenesis. The aim is to strengthen the model's reasoning over TCM prescription data and its diagnostic logic.

Implementation Strategy:

Structured Data Utilization: We use structured TCM gynecology prescription tables to provide the model with detailed and specific data. This structured approach ensures that the information is clear and precise, reducing ambiguity in the model's understanding.
Scenario-based Prompt Templates: We design strict prompt templates for each of the 15 scenarios. These templates guide the model in generating relevant and accurate outputs, ensuring that each scenario is thoroughly covered.
Interleaved Learning and Spaced Practice: By incorporating interleaved learning and spaced practice into the prompt generation process, we mimic the human learning process. This method helps in reinforcing the model's understanding and retention of TCM knowledge.
Diverse Scenario Simulation: The prompts simulate diverse scenarios such as patient treatment stories, diagnostic analysis, and narrative medicine. This diversity ensures that the model can handle a wide range of situations and enhances its flexibility and adaptability.
Critical Thinking and Follow-up: Prompts are designed to encourage critical thinking and detailed follow-up processes. This aspect helps in developing the model's ability to think critically and provide comprehensive diagnostic and treatment plans.
Real-world Problem Solving: Including real-world problems in the scenarios helps in bridging the gap between theoretical knowledge and practical application. This approach ensures that the model's outputs are not only accurate but also practically relevant.
Expert Review and Iterative Improvement: The outputs generated by the model are reviewed by experts in TCM gynecology. Feedback from these reviews is used to iteratively improve the model's performance, ensuring that it meets professional standards.

By implementing these strategies, we aim to create a robust framework that enhances the model's reasoning capabilities and its understanding of TCM prescription data and diagnostic logic. This approach not only improves the accuracy and relevance of the model's outputs but also ensures that it can handle complex and diverse medical scenarios effectively.
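
To make the idea of scenario-based templates over a structured table concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the field names, the three template wordings (the project's actual 15 templates are not given in this document), and the use of a seeded shuffle as a rough stand-in for interleaving. The example row reuses the San Yuan Tang details that appear later in this document.

```python
import random

# Hypothetical templates for 3 of the 15 scenarios; wording is illustrative.
TEMPLATES = {
    "diagnostic_analysis": (
        "Prescription: {name} (source: {source}). Indication: {indication}. "
        "Analyze the diagnostic reasoning that leads to this prescription."
    ),
    "prescription_function": (
        "Prescription: {name}. Herbs: {herbs}. "
        "Explain the function of this prescription for {indication}."
    ),
    "etiology_pathogenesis": (
        "For {indication}, treated with {name}, "
        "describe the etiology and pathogenesis."
    ),
}


def build_prompts(rows, templates=TEMPLATES, seed=0):
    """Fill every template for every table row, then mix the scenarios."""
    prompts = []
    for row in rows:
        fields = dict(row, herbs=", ".join(row["herbs"]))
        for scenario, template in templates.items():
            prompts.append({
                "scenario": scenario,
                "name": row["name"],
                "prompt": template.format(**fields),
            })
    # Seeded shuffle so consecutive prompts vary in scenario and prescription.
    random.Random(seed).shuffle(prompts)
    return prompts


rows = [{
    "name": "San Yuan Tang",
    "source": "Bao Ming Ji",
    "indication": "prolonged postpartum weakness and rapid pulse",
    "herbs": ["Chai Hu", "Huang Qin", "Ren Shen", "Ban Xia", "Gan Cao",
              "Chuan Xiong", "Shao Yao", "Shu Di Huang", "Dang Gui"],
}]
prompts = build_prompts(rows)
for item in prompts:
    print(f"[{item['scenario']}] {item['prompt']}")
```

Because each prompt is filled only from table fields, the facts the model is asked to elaborate on stay tied to the source row, which is the point of combining strict templates with structured data.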

Detailed Introduction to the Complete Process of San Yuan Tang

Instruction: I'm very curious about the complete process of San Yuan Tang. Could you provide a detailed introduction from the overview, medical case, symptoms, diagnosis and treatment, and discussion of results?

Input:

Output:

San Yuan Tang is an ancient traditional Chinese medicine prescription, first appearing in the "Bao Ming Ji" (Preserving Life Collection) Volume 2. It consists of various Chinese herbs and is mainly used to treat prolonged postpartum weakness and rapid pulse.

The patient, Ms. Li, suffered from prolonged postpartum weakness and rapid pulse. Based on her medical history and symptoms, I decided to use San Yuan Tang to treat her condition. According to the prescription's composition and dosage, I prepared a decoction of Chai Hu (Bupleurum), Huang Qin (Scutellaria), Ren Shen (Ginseng), Ban Xia (Pinellia), Gan Cao (Licorice), Chuan Xiong (Ligusticum), Shao Yao (Peony), Shu Di Huang (Prepared Rehmannia), and Dang Gui (Angelica). Ms. Li was instructed to take this decoction twice daily.

During the treatment, I also paid attention to Ms. Li's diet and rest, advising her to eat easily digestible foods and ensure adequate sleep and rest. After a month of treatment, Ms. Li's symptoms significantly improved. Her mental state stabilized, her appetite increased, and the dizziness and fatigue reduced considerably.

I believe that the therapeutic effects of San Yuan Tang are primarily due to the synergistic action of the various herbs, which can regulate bodily functions, enhance physical strength, and boost immunity. Additionally, I noted that Ms. Li's diet and rest also contributed to the treatment's effectiveness. Therefore, when using traditional Chinese medicine to treat illnesses, comprehensive patient care, including diet and rest, should also be considered.

1.2 Instruction Data Construction Strategy for Traditional Chinese Medicine Knowledge

In addition, we have included instructions built from six kinds of TCM knowledge: the content of ancient Chinese medical texts, definitions of terms, synonyms and antonyms of symptoms, syndromes, symptoms, and treatment methods. To create a controlled experiment, we used only one instruction template for this part of the data. It comprises approximately 80,000 entries, significantly more than the number of instructions constructed using the aforementioned strategies. Below are the approximate number of instructions and a description of the content for each category.

Instruction Counts and Content by Category:

Ancient Chinese Medical Texts Content:
- Number of Instructions: ~10,000
- Content: Detailed entries derived from classical texts to ensure accurate representation of traditional practices.
Term Definitions:
- Number of Instructions: ~5,000
- Content: Clear and concise definitions of TCM terms for better understanding and standardization.
Synonyms and Antonyms of Symptoms:
- Number of Instructions: ~8,000
- Content: Comprehensive lists of synonyms and antonyms to enrich the model's vocabulary and understanding of TCM symptoms.
Syndromes:
- Number of Instructions: ~12,000
- Content: Descriptions and explanations of various TCM syndromes to enhance diagnostic accuracy.
Symptoms:
- Number of Instructions: ~15,000
- Content: Detailed symptom descriptions to improve the model's ability to recognize and interpret patient conditions.
Treatment Methods:
- Number of Instructions: ~30,000
- Content: Extensive information on TCM treatment methods, including herbal prescriptions, acupuncture, and dietary recommendations.

By constructing a large volume of instruction data with a single template, we aim to establish a robust foundation of TCM knowledge that can be effectively utilized by the model. This strategy allows us to compare the effectiveness of diverse instructional methods against a standardized dataset, ensuring a comprehensive evaluation of the model's performance in handling traditional Chinese medicine information.
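
As a quick consistency check on the figures above, the per-category counts can be totaled and expressed as shares. The numbers are the approximate counts listed in this section; the dictionary name is just a label chosen for the example.

```python
# Approximate instruction counts per category, as listed above.
CATEGORY_COUNTS = {
    "Ancient Chinese Medical Texts Content": 10_000,
    "Term Definitions": 5_000,
    "Synonyms and Antonyms of Symptoms": 8_000,
    "Syndromes": 12_000,
    "Symptoms": 15_000,
    "Treatment Methods": 30_000,
}

total = sum(CATEGORY_COUNTS.values())
shares = {name: count / total for name, count in CATEGORY_COUNTS.items()}

for name, count in CATEGORY_COUNTS.items():
    print(f"{name}: ~{count:,} ({shares[name]:.1%})")
print(f"Total: ~{total:,}")
```

The six approximate counts add up to 80,000, matching the stated size of this part of the dataset, with treatment methods making up 37.5% of it.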

Instruction: Please answer the following questions related to the definitions of traditional Chinese medicine disease terms:

Input: Which category does (Feng Fei) acne belong to?

Output: Acne caused by lung wind, stomach heat, or liver stagnation. It primarily manifests as blackheads or whiteheads, papules, pustules, nodules, cysts, and scars on the face and back.
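
The single-template format used for this part of the data can be sketched as a small record builder. The three keys mirror the Instruction / Input / Output labels shown above; storing each record as one JSON line is an assumption made for the example, not something this document specifies.

```python
import json

# The one fixed instruction used for term-definition entries (from the example above).
TEMPLATE_INSTRUCTION = (
    "Please answer the following questions related to the definitions "
    "of traditional Chinese medicine disease terms:"
)


def make_record(question, answer, instruction=TEMPLATE_INSTRUCTION):
    """Wrap one question/answer pair in the fixed instruction template."""
    return {"instruction": instruction, "input": question, "output": answer}


record = make_record(
    "Which category does (Feng Fei) acne belong to?",
    "Acne caused by lung wind, stomach heat, or liver stagnation. "
    "It primarily manifests as blackheads or whiteheads, papules, pustules, "
    "nodules, cysts, and scars on the face and back.",
)

# ensure_ascii=False keeps any Chinese characters readable in the output file.
line = json.dumps(record, ensure_ascii=False)
print(line)
```

Since only the input and output vary, every entry in this part of the dataset shares the same instruction text, which is what makes it usable as the controlled baseline described above.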

2. Model Performance Comparison:

Our test data consists of real medical cases from senior traditional Chinese medicine (TCM) practitioners, typically provincial-level senior TCM doctors or national TCM masters, which ensures a certain level of professionalism. Strictly speaking, this is out-of-distribution (OOD) data: it is OOD both in subject matter and relative to the training dataset, which distinguishes it from traditional training and validation sets.
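
One simple way to check that test cases really lie outside the training set is to look for near-duplicates. The sketch below is an assumption-laden illustration rather than the project's procedure: it compares character 3-gram sets with Jaccard similarity, the 0.8 threshold is arbitrary, and the example texts are invented.

```python
def char_ngrams(text, n=3):
    """Character n-grams of the text after lowercasing and removing whitespace."""
    cleaned = "".join(text.lower().split())
    return {cleaned[i:i + n] for i in range(max(len(cleaned) - n + 1, 1))}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_leaks(test_cases, train_entries, threshold=0.8):
    """Return (test index, best similarity) for test cases too close to training data."""
    train_grams = [char_ngrams(entry) for entry in train_entries]
    leaks = []
    for idx, case in enumerate(test_cases):
        grams = char_ngrams(case)
        best = max((jaccard(grams, g) for g in train_grams), default=0.0)
        if best >= threshold:
            leaks.append((idx, best))
    return leaks


# Invented texts: the first test case duplicates a training entry.
train_entries = ["Postpartum weakness with rapid pulse; prescribe San Yuan Tang."]
test_cases = [
    "postpartum weakness with rapid pulse;  prescribe San Yuan Tang.",
    "Chronic cough with yellow phlegm and a red tongue.",
]
leaks = find_leaks(test_cases, train_entries)
print(leaks)
```

A check like this only catches surface overlap with the training set; whether a case is out-of-distribution in subject matter still needs a judgment about the diseases and practitioners it covers.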