Publications
Please see my full publication list on Google Scholar.
2024
- A hierarchical deep model integrating economic facts for stock movement prediction. Jiahao Yang, Ming Zhang, Shuo Feng, Xuejun Zhang, and Xing Bai. Engineering Applications of Artificial Intelligence, 2024.
Accurate stock movement prediction is essential to profit from the stock market. However, this task is challenging due to the complexity and non-stationary nature of the market. Deep learning methods have attracted increasing attention and achieved success in mining price movement patterns, but several limitations affect their performance. In general, the stock market is ever-changing and many factors affect stock movement, so capturing stock movement patterns is hard without enough prior information. To tackle this, we employ economic facts to improve the deep learning method. In this paper, we propose a novel Hierarchical Deep learning Model that fuses Economic Facts (HDMEF) to predict stock movement from the micro to the macro tier: the individual, industry, and whole-market tiers. Specifically, we present three well-designed modules to model these tiers separately, based on the Capital Asset Pricing Model (CAPM), herding effects, and holiday effects in the stock market. Experiments on the A-share CSI300 and CSI500 indexes demonstrate that our proposed method performs best on all test phases compared with previous competitive baselines, with an absolute improvement of 2%–3% on some test phases where all the baselines perform poorly, showing that our method is more efficient and robust across different market conditions. In addition, we conduct an ablation study to analyze the role of the various economic effects used in our model, and the results show that each module contributes to prediction.
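The individual-tier module builds on the Capital Asset Pricing Model named in the abstract. For reference, the standard textbook form (this is the conventional notation, not notation taken from the paper itself) relates a stock's expected return to the market's:

```latex
\mathbb{E}[R_i] = R_f + \beta_i \,\bigl(\mathbb{E}[R_m] - R_f\bigr),
\qquad
\beta_i = \frac{\operatorname{Cov}(R_i, R_m)}{\operatorname{Var}(R_m)}
```

Here \(R_i\) is the stock's return, \(R_m\) the market return, \(R_f\) the risk-free rate, and \(\beta_i\) the stock's sensitivity to market movements.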
- From coarse to fine: Enhancing multi-document summarization with multi-granularity relationship-based extractor. Ming Zhang, Jiyu Lu, Jiahao Yang, Jun Zhou, Meilin Wan, and Xuejun Zhang. Information Processing & Management, 2024.
Multi-Document Summarization (MDS) is a challenging task because multiple documents not only have extremely long inputs but may also overlap, complement, or contradict each other. In this paper, we propose to capture complex cross-document interactions and handle lengthy inputs for better multi-document summarization. Specifically, we present MDS-MGRE, a coarse-to-fine MDS framework that introduces Multi-Granularity Relationships into an extract-then-summarize pipeline. In the coarse-grained stage, multi-granularity embedding, heterogeneous graph construction, and the MGRExtractor work together to convert redundant multi-document input into compact meta-documents. We first use the pre-trained language model BERT to obtain semantically rich embeddings at different granularities: documents, paragraphs, sentence-sets, and sentences. Then, we construct a heterogeneous graph with four types of nodes (document, paragraph, sentence-set, and sentence nodes) and corresponding connecting edges to model rich document relationships. Furthermore, we propose a novel Multi-Granularity Relationship-based Extractor (MGRExtractor) that produces meta-documents by efficiently pruning the heterogeneous graph. More precisely, it consists of four main modules: noise removal, redundancy removal, multi-granularity scoring, and sentence-set selection. In the fine-grained stage, we employ the large configuration of BART as our abstractive summarizer to generate system summaries from the extracted meta-documents. Experimental results on two benchmark datasets show that our framework significantly outperforms strong baselines with comparable parameters and only slightly underperforms methods with a maximum encoding length of 16,384 tokens. On Multi-News and WCEP, automatic evaluation shows that MDS-MGRE achieves average performance improvements of 1.75% and 8.77%, respectively, over state-of-the-art systems with comparable parameters. These positive results demonstrate the benefit of generating high-quality meta-documents that model rich document relationships to enhance MDS.
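As a rough illustration of the four node granularities in the heterogeneous graph, here is a minimal sketch. The adjacent-sentence pairing used to form sentence-sets is an assumption for illustration only, not the paper's grouping strategy, and all function and variable names are hypothetical:

```python
def build_nodes(documents):
    """Enumerate four node granularities for a heterogeneous summarization graph.

    documents: list of documents, each a list of paragraphs, each a list
    of sentences. Sentence-sets are formed here by naively pairing
    adjacent sentences, purely for illustration.
    """
    doc_nodes, para_nodes, set_nodes, sent_nodes = [], [], [], []
    for d, doc in enumerate(documents):
        doc_nodes.append(("doc", d))
        for p, para in enumerate(doc):
            para_nodes.append(("para", d, p))
            for s in range(0, len(para), 2):      # each pair of adjacent sentences is one set
                set_nodes.append(("set", d, p, s // 2))
            for s, _ in enumerate(para):
                sent_nodes.append(("sent", d, p, s))
    return doc_nodes, para_nodes, set_nodes, sent_nodes
```

Edges between these node types (containment, similarity, and so on) would then carry the cross-document relationships the extractor prunes.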
- ROUGE-SEM: Better evaluation of summarization using ROUGE combined with semantics. Ming Zhang, Chengzhang Li, Meilin Wan, Xuejun Zhang, and Qingwei Zhao. Expert Systems with Applications, 2024.
With the development of pre-trained language models and large-scale datasets, automatic text summarization has attracted much attention from the natural language processing community, but progress on automatic summarization evaluation has stagnated. Although there have been efforts to improve it, ROUGE has remained one of the most popular metrics for nearly 20 years due to its competitive evaluation performance. However, ROUGE is not perfect: studies have shown that it suffers from inaccurate evaluation of abstractive summarization and limits the diversity of generated summaries, both caused by lexical bias. To avoid this bias, more and more embedding-based metrics have been proposed that evaluate summaries by measuring semantic similarity. Because accurately measuring semantic similarity remains challenging, none of them has fully replaced ROUGE as the default automatic evaluation toolkit for text summarization. To address these problems, we propose a compromise evaluation framework (ROUGE-SEM) that improves ROUGE with semantic information, compensating for its lack of semantic awareness through a semantic similarity module. According to the differences between semantic similarity and lexical similarity, summaries are classified into four categories for the first time: good-summary, pearl-summary, glass-summary, and bad-summary. In particular, back-translation is adopted to rewrite pearl-summaries and glass-summaries, which ROUGE evaluates inaccurately, to alleviate lexical bias. In this pipeline framework, summaries are first classified by a candidate-summary classifier, then rewritten by a categorized-summary rewriter, and finally scored by a rewritten-summary scorer, yielding an evaluation consistent with human judgment.
When measured with Pearson, Spearman, and Kendall rank correlation coefficients, our framework achieves comparable or higher correlations with human judgments than several state-of-the-art automatic summarization evaluation metrics along the dimensions of coherence, consistency, fluency, and relevance. This suggests that improving ROUGE with semantics is a promising direction for automatic summarization evaluation.
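The four-way classification can be pictured as quadrants of the (lexical similarity, semantic similarity) plane. The sketch below is a guess at that mapping inferred from the abstract alone: the assignment of pearl-summary to "semantically good but lexically dissimilar" and glass-summary to the reverse is an assumption, and the thresholds are illustrative, not the paper's values:

```python
def classify_summary(lexical_sim: float, semantic_sim: float,
                     lex_thresh: float = 0.5, sem_thresh: float = 0.5) -> str:
    """Map a candidate summary into one of the four ROUGE-SEM categories.

    The quadrant assignment is inferred from the abstract: pearl-summaries
    are under-rated by ROUGE (good semantics, poor lexical overlap) and
    glass-summaries over-rated (good overlap, poor semantics); both are
    the ones the framework rewrites via back-translation.
    """
    if lexical_sim >= lex_thresh and semantic_sim >= sem_thresh:
        return "good-summary"
    if lexical_sim < lex_thresh and semantic_sim >= sem_thresh:
        return "pearl-summary"   # ROUGE under-estimates; rewrite via back-translation
    if lexical_sim >= lex_thresh and semantic_sim < sem_thresh:
        return "glass-summary"   # ROUGE over-estimates; rewrite via back-translation
    return "bad-summary"
```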
- MCRSpell: A metric learning of correct representation for Chinese spelling correction. Chengzhang Li, Ming Zhang, Xuejun Zhang, and Yonghong Yan. Expert Systems with Applications, 2024.
Chinese spelling correction (CSC) is a difficult but rewarding task that not only helps people read and understand text in daily life but also serves as pre-processing for numerous natural language processing (NLP) applications. With pre-trained language models such as BERT, many recent works have achieved great success. However, these methods have three limitations. First, they rely on the output of the last layer of the correction model for parameter updates, ignoring the guidance of intermediate features; this can leave the model inadequately trained and lead to overfitting. Second, they use a variety of data augmentations that depend on expert knowledge; some augmentation rules only improve performance on the target dataset, while performance on out-of-set data decreases instead. Third, due to the nature of statistical models, they tend to transform low-frequency expressions into more common high-frequency expressions, although correct expressions should be preserved as much as possible in error correction tasks. In this work, we propose a novel general framework for CSC that leverages metric learning to adaptively learn semantic knowledge from multiple intermediate features. Our approach only feeds the input text and target text of the parallel corpus to the model separately, without any data augmentation. Moreover, we design a replication mechanism that keeps low-frequency correct expressions instead of over-correcting them. Extensive experiments on benchmarks show that our solution greatly outperforms state-of-the-art CSC methods. We will release the source code for further use by the community.
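The paper's replication mechanism is learned; a much-simplified, hypothetical threshold version of the same idea (keep the source character unless the corrector is confident, so rare but correct expressions survive) could look like this. The function name, signature, and threshold are all illustrative assumptions, not the paper's design:

```python
def correct_with_copy(src_chars, cand_chars, confidences, keep_thresh=0.9):
    """Per-character copy gate for spelling correction.

    src_chars:   the original input characters.
    cand_chars:  the model's proposed corrections, one per input character.
    confidences: the model's confidence in each proposed correction.
    A correction is accepted only when confidence clears keep_thresh;
    otherwise the source character is copied through unchanged.
    """
    return [cand if conf >= keep_thresh else src
            for src, cand, conf in zip(src_chars, cand_chars, confidences)]
```

A learned gate would replace the fixed threshold, but the copy-through behavior is the core of why low-frequency correct expressions are not over-corrected.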
- Enhancing the Reliability of SC PUF Through Optimal Capacitor Configuration. Zhou Wang, Yin Zhang, Yingchen Ma, Ming Zhang, Zhangqing He, and Meilin Wan. IEEE Transactions on Circuits and Systems I: Regular Papers, 2024.
- An efficient loss function and deep learning approach for ranking stock returns in the absence of prior knowledge. Jiahao Yang, Shuo Feng, Wenkai Zhang, Ming Zhang, Jun Zhou, and Pengyuan Zhang. Information Processing & Management, 2024.
To pursue profit in dynamic, complex, and noisy stock markets, many efforts have applied deep learning methods to forecast asset price movements. We observe two issues in current work. First, there is a discrepancy between the forecasting target and actual profitability: better forecasting results do not necessarily guarantee higher profits. Second, many existing methods rely heavily on prior knowledge during forecasting, which requires information gathering and may not adapt well to dynamic and complex market conditions. For the first issue, we design a novel reward learning loss function (RL-Loss) as the optimization objective to better bridge the forecasting target and the profit from the trading process. For the second issue, we propose a structure based on multi-head attention that models inter-stock relations directly from trading data without relying on prior knowledge. We also present a simple time-asynchronous attention-based method to model the lead–lag phenomenon in the market. We conduct experiments on over 600 stocks from the CSI100, CSI300, and CSI500 indexes from 2010 to 2020 against five strong baselines. The results show that our methods achieve annualized returns 5%, 10%, and 13% above the best baseline for long positions and 5%, 6%, and 8% above it for short positions on the three indexes. Further analysis shows that our RL-Loss outperforms the classic PR-Loss, and that the proposed prior-knowledge-free inter-stock relation modeling methods are effective.
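Attending across stocks rather than across time is the core of the inter-stock relation idea. Below is a loose, single-head NumPy sketch of that mechanism with randomly initialized projections; the names, shapes, and single-head form are illustrative assumptions, not the paper's multi-head architecture:

```python
import numpy as np

def inter_stock_attention(feats: np.ndarray) -> np.ndarray:
    """Minimal single-head self-attention across stocks.

    feats: (n_stocks, d) matrix, one feature vector per stock.
    Returns relation-aware features of the same shape. Projection
    weights are random here, purely for illustration.
    """
    n, d = feats.shape
    rng = np.random.default_rng(0)
    w_q, w_k, w_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
    q, k, v = feats @ w_q, feats @ w_k, feats @ w_v
    scores = q @ k.T / np.sqrt(d)                        # (n, n) pairwise stock affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over peer stocks
    return weights @ v                                   # mix peer features by affinity
```

Each stock's output row is a weighted mixture of all stocks' value vectors, so correlations between stocks are learned from trading data instead of being supplied as prior knowledge.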
2022
- Predicting long-term stock movements with fused textual features of Chinese research reports. Ming Zhang, Jiahao Yang, Meilin Wan, Xuejun Zhang, and Jun Zhou. Expert Systems with Applications, 2022.
By shaping investors’ perceptions and assessments of a stock, research reports have a significant impact on the stock market. Due to the limitations of text-mining technology, it is difficult for researchers to effectively utilize long research reports, and most studies focus mainly on investor sentiment. However, for lack of appropriate open-domain toolkits, sentiment annotation often requires expensive manual labeling. In addition, most existing studies show the success of using textual data as a supplement to historical price data in short-term forecasting, but not in long-term forecasting. To fill this gap and address the annotation problem, we introduce a novel knowledge-driven approach for long-term stock movement prediction based on Chinese research reports. Specifically, we propose SMPRR, a new long-term Stock Movement Prediction dataset built from Research Reports; it mainly consists of long, formal, and professional research reports together with historical prices. Furthermore, we propose the Multi-module Feature Fusion method based on the pre-trained language model FinBERT (MFF-FinBERT), which effectively fuses textual features from research reports. Experimental results show that the proposed model outperforms existing methods in forecasting one-year stock movements, reaching an accuracy of 79.2%. The results also indicate that basic stock information plays an important role in long-term forecasting, in line with the theory of value investing.