<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychol.</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2022.762402</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>The Use of Deep Learning-Based Intelligent Music Signal Identification and Generation Technology in National Music Teaching</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Tang</surname> <given-names>Hui</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Zhang</surname> <given-names>Yiyao</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhang</surname> <given-names>Qiuying</given-names></name>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1057417/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>School of Arts, Hunan City University</institution>, <addr-line>Yiyang</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Music, Chugye University for the Arts</institution>, <addr-line>Seoul</addr-line>, <country>South Korea</country></aff>
<aff id="aff3"><sup>3</sup><institution>College of Art and Communication, Beijing Normal University</institution>, <addr-line>Beijing</addr-line>, <country>China</country></aff>
<aff id="aff4"><sup>4</sup><institution>College of Art, Yunnan Minzu University</institution>, <addr-line>Kunming</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Fouzi Harrou, King Abdullah University of Science and Technology, Saudi Arabia</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Chiranjibi Sitaula, Monash University, Australia; Zhicheng Yang, PAII Inc., United States</p></fn>
<corresp id="c001">&#x002A;Correspondence: Yiyao Zhang, <email>11112018044@bnu.edu.cn</email></corresp>
<fn fn-type="other" id="fn004"><p>This article was submitted to Auditory Cognitive Neuroscience, a section of the journal Frontiers in Psychology</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>22</day>
<month>06</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>762402</elocation-id>
<history>
<date date-type="received">
<day>07</day>
<month>09</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>29</day>
<month>04</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2022 Tang, Zhang and Zhang.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Tang, Zhang and Zhang</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>This research explores the application of intelligent music recognition technology in music teaching. Based on the Long Short-Term Memory (LSTM) network, an algorithm model that can distinguish different music signals and generate music of various genres is designed and implemented. First, the application of machine learning and deep learning in the field of music is analyzed, and the model is designed to realize intelligent music generation, which provides a theoretical basis for related research. Then, the music style discrimination and generation model is tested on a large collection of music data. The experimental results show that when the designed model has four hidden layers with 1,024, 512, 256, and 128 neurons, respectively, the difference in training results is the smallest. With the designed model, the classification accuracy for jazz, classical, rock, country, and disco music exceeds 60%. Among these genres, jazz is classified best, with an accuracy of 77.5%. Moreover, compared with the traditional algorithm, the frequency distribution of the music score generated by the designed algorithm is almost consistent with the spectrum of the original music. Therefore, the proposed methods and models can distinguish music signals and generate different music, and their discrimination accuracy for different music signals is higher than that of the traditional restricted Boltzmann machine method.</p>
</abstract>
<kwd-group>
<kwd>deep learning</kwd>
<kwd>music style</kwd>
<kwd>Long Short-Term Memory network</kwd>
<kwd>psychology</kwd>
<kwd>quality education</kwd>
</kwd-group>
<counts>
<fig-count count="9"/>
<table-count count="1"/>
<equation-count count="4"/>
<ref-count count="34"/>
<page-count count="9"/>
<word-count count="5490"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1" sec-type="intro">
<title>Introduction</title>
<p>In the Internet age, the concept of &#x201C;music without borders&#x201D; is accepted by more and more people. Although musical expression differs across countries and regions, the thoughts and emotions contained in music can always resonate with people. Music fully expresses its value in human life (<xref ref-type="bibr" rid="B33">Yuan and Wu, 2020</xref>). Music production is an artistic expression of people&#x2019;s thoughts and feelings with music as the carrier. Music therefore carries people&#x2019;s most sincere feelings, which are perceived through hearing. Research on the psychological changes in music teaching helps in understanding the changes in individual behavior and psychological cognition during teaching.</p>
<p>There are countless kinds of musical instruments in the world, and the storage formats of music files have also diversified. Music genres have gradually formed alongside the emergence of musical instruments and the diversification of music storage methods. Jazz, classical music, pop music, hip-hop, and rock music have become familiar terms. Traditional music arrangement and music information retrieval are gradually being replaced by computer technology. Digital audio processing, speech recognition, speech compression coding, and text-to-speech conversion have become increasingly diverse and accurate with the revolution in information technology. <xref ref-type="bibr" rid="B23">Sitaula et al. (2021)</xref> applied neural network technology to the classification of intestinal peristaltic and non-peristaltic sounds and refined the classification results with a Laplace hidden semi-Markov model. Their experimental results show that this method can improve the accuracy of bowel sound detection and may support telemedicine applications in neonatal nursing in the future. With the wide application of machine learning (<xref ref-type="bibr" rid="B15">Ma et al., 2021</xref>) and deep learning (DL) (<xref ref-type="bibr" rid="B18">Quazi et al., 2021</xref>) in face recognition, speech recognition, and image recognition, researchers are gradually trying to apply these technologies to music generation. <xref ref-type="bibr" rid="B9">Hongdan et al. (2022)</xref> applied machine learning technology to the recognition and classification of recording genres. They proposed a model based on a convolutional neural network (CNN) that is trained to identify the spectrum of recorded audio. In addition, time-domain and frequency-domain features were extracted from the audio signal, combined in the machine learning model, and used to train it to classify the audio files. 
The research shows that classification accuracy is largely affected by the features selected for classification. In some classification systems, training errors may affect the model&#x2019;s output. <xref ref-type="bibr" rid="B16">Nag et al. (2022)</xref> proposed a CNN-based algorithm model to identify the emotion contained in Indian classical music. A database of 1,600 emotion fragments extracted from Indian classical music was established, and the emotion in the music was classified by the CNN-based method. <xref ref-type="bibr" rid="B14">Li et al. (2022)</xref> applied deep neural networks to music classification and used a spectrum diagram to evaluate the model&#x2019;s performance. Each music audio file was converted into a spectrum through modal transformation, and the music was then classified through DL. The experimental results show that the proposed model consistently outperforms the other neural network models. Deep learning is more powerful than traditional machine learning in storing and processing massive data (<xref ref-type="bibr" rid="B26">Wu W. et al., 2020</xref>). Hence, deep neural networks are increasingly used in music analysis and processing, especially Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) networks.</p>
<p>RNNs were first applied to music classification, but the results were not ideal. Preceding and following notes are strongly correlated, yet an ordinary RNN cannot retain data from much earlier time steps, so the classification and the acquisition of musical features such as tone, timbre, loudness, and rhythm become inaccurate. Researchers have improved the RNN by adding a forget gate to the original structure. This enables the network to record relevant earlier information and overcomes the problem of capturing dependencies across long sequences. At present, the LSTM network is widely used for emotion analysis and in some intelligent recommendation models. Here, neural network technology is used to achieve intelligent music recognition, and the ability of the designed algorithm to handle related problems is improved by optimizing the recognition process. DL technology is adopted to realize the intelligent recognition and generation of music signals, and the algorithm&#x2019;s output is improved by optimizing the model parameters and structure.</p>
</sec>
<sec id="S2" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec id="S2.SS1">
<title>Analysis of National Music Teaching Based on Psychology</title>
<p>Music education psychology is the research on the changes in psychological activities in music teaching. It is the product of the combination of psychology and education. Besides, psychology can be adopted to study the changes in people&#x2019;s psychological law in teaching. In cognitive psychology, the occurrence and defense of feelings such as feeling, attention, consciousness, knowledge, and gene can be systematically explained to provide a reference for people&#x2019;s research on cognitive activities such as imagination, meaning, and thinking. The psychological activities related to cognitive psychology and sound provide a basis for studying new three-dimensional characters in national music teaching.</p>
<p>Music content and emotional expression run through the three links of creation, performance, and listening. Each link gives music special significance and vitality and involves cognitive activities in people&#x2019;s feelings, perceptions, and consciousness. <xref ref-type="bibr" rid="B31">Wu et al. (2018)</xref> studied the application of information and communication technology to traditional teaching to improve the effectiveness of teaching and training. Hence, the effect of introducing information technology and DL into the national music teaching classroom is analyzed from the perspective of educational psychology, so that the application of intelligent music signal identification and generation technology in national music teaching can be judged accurately (<xref ref-type="bibr" rid="B28">Wu and Song, 2019</xref>). Psychology can thus be employed to analyze the psychological cognitive process of applying DL technology to national music teaching.</p>
</sec>
<sec id="S2.SS2">
<title>The Music Genre and Timbre Characteristics</title>
<p>The most basic part of identifying music signals is the classification of music genres. Music is divided into multiple genres according to its characteristics, and the characteristics of different genres differ considerably (<xref ref-type="bibr" rid="B11">Jiang et al., 2020</xref>). However, music genres have similarities as well as differences (<xref ref-type="bibr" rid="B13">Kim and Oh, 2021</xref>). At present, the most widely recognized classification structures are the GTZAN Genre and the ISMIR2004 Genre. The GTZAN Genre divides music into 10 categories: blues, country, hip-hop, jazz, pop, disco, classical, rock, reggae, and metal. The ISMIR2004 Genre divides music into six genres: classical, electronic, jazz/blues, metal/punk, rock/pop, and world (<xref ref-type="bibr" rid="B17">Ng et al., 2020</xref>).</p>
<p>Music characteristics embody the essential attributes of music. Extracting them is crucial for distinguishing different music styles and genres (<xref ref-type="bibr" rid="B3">Caparrini et al., 2020</xref>). At present, there are two main types of music features: physical features and time-domain features. Short-term features are divided based on human sensory characteristics and mainly include tone, timbre, and loudness, which can be expressed by specific numerical features (<xref ref-type="bibr" rid="B25">Wick and Puppe, 2021</xref>). Some time-domain features cannot be expressed by specific numbers. Two time-domain features are detailed as follows:</p>
<p>(1) The short-time energy represents the amplitude of the music signal at a certain time. The calculation reads:</p>
<disp-formula id="S2.E1">
<label>(1)</label>
<mml:math id="M1">
<mml:mtable displaystyle="true" rowspacing="0pt">
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msub>
<mml:mi>&#x03C9;</mml:mi>
<mml:mi>n</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:msubsup>
<mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mi mathvariant="normal">&#x221E;</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mi mathvariant="normal">&#x221E;</mml:mi>
</mml:msubsup>
</mml:mstyle>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>&#x03D5;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mi/>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:msubsup>
<mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:msubsup>
</mml:mstyle>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>&#x03D5;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>
<p>&#x03C9;<sub><italic>n</italic></sub> is the short-time energy at the <italic>n</italic>-th sampling point, &#x03B8;(<italic>m</italic>) is the signal value at sampling point <italic>m</italic>, &#x03D5;(<italic>n</italic>&#x2212;<italic>m</italic>) represents the window function, and <italic>N</italic> denotes the window length (<xref ref-type="bibr" rid="B29">Wu and Wu, 2017</xref>).</p>
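<p>As a minimal sketch of Equation (1), the short-time energy can be computed by summing the squared, windowed samples that fall inside the window ending at the current sampling point. The function name short_time_energy, the list-based window, and the sample values are illustrative assumptions, not part of the original implementation; the example assumes a rectangular window (all weights equal to 1).</p>
<preformat><![CDATA[
```python
def short_time_energy(signal, n, window):
    """Short-time energy at sample n for a window given as a list of weights.

    window[k] plays the role of the window function evaluated at k = n - m,
    so the sum runs over m = n - (N - 1), ..., n with N = len(window).
    """
    total = 0.0
    for k, weight in enumerate(window):
        m = n - k
        if 0 <= m < len(signal):  # samples outside the signal contribute 0
            total += (signal[m] * weight) ** 2
    return total


# Illustrative values only: a rectangular window of length 3 ending at n = 3.
samples = [0.0, 1.0, -2.0, 3.0, -1.0]
energy = short_time_energy(samples, n=3, window=[1.0, 1.0, 1.0])
```
]]></preformat>
<p>In this example, the window covers the samples 1.0, &#x2212;2.0, and 3.0, so the energy is the sum of their squares.</p>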
<p>(2) The short-time average zero-crossing rate is a crucial index for measuring the high-frequency content of a signal. In waveform analysis, the more high-frequency components a signal has, the more often it crosses zero. Equation (2) defines this feature:</p>
<disp-formula id="S2.E2">
<label>(2)</label>
<mml:math id="M2">
<mml:mrow>
<mml:msub>
<mml:mi mathvariant="normal">&#x03BB;</mml:mi>
<mml:mi>n</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mtext>sgn</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mtext>sgn</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:mi>&#x03B8;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>&#x03D5;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>&#x03BB;<sub><italic>n</italic></sub> represents the short-time zero-crossing rate, &#x03B8;(<italic>m</italic>) denotes the signal value at sampling point <italic>m</italic>, &#x03D5;(<italic>n</italic>&#x2212;<italic>m</italic>) is the window function, and <italic>A</italic> is the window length (written <italic>N</italic> in the summation limits). sgn represents the sign function: its value is 1 when &#x03B8;(<italic>m</italic>)&#x2265;0 and 0 otherwise.</p>
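<p>The zero-crossing rate can be sketched in a few lines. The sketch below reads Equation (2) in the common textbook form, in which the absolute difference between the signs of adjacent samples is summed over a rectangular window and divided by twice the window length; this reading, the function names, and the test signals are assumptions made for illustration. The sign function follows the 1/0 convention given in the text, so with the alternative +1/&#x2212;1 convention the returned values would double.</p>
<preformat><![CDATA[
```python
def sgn(x):
    """Sign function as defined in the text: 1 for x >= 0, otherwise 0."""
    return 1 if x >= 0 else 0


def zero_crossing_rate(signal, n, window_len):
    """Short-time zero-crossing rate over the window_len samples ending at n."""
    start = max(1, n - (window_len - 1))  # m - 1 must stay a valid index
    changes = sum(
        abs(sgn(signal[m]) - sgn(signal[m - 1])) for m in range(start, n + 1)
    )
    return changes / (2 * window_len)


# Illustrative signals: one that changes sign at every sample, one that never does.
alternating = [1.0, -1.0, 1.0, -1.0, 1.0]
steady = [1.0, 1.0, 1.0, 1.0, 1.0]
rate_alternating = zero_crossing_rate(alternating, n=4, window_len=4)
rate_steady = zero_crossing_rate(steady, n=4, window_len=4)
```
]]></preformat>
<p>A signal rich in high-frequency content, such as the alternating example, yields a larger rate than a slowly varying one, which matches the role of this feature described above.</p>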
</sec>
<sec id="S2.SS3">
<title>Musical Instrument Digital Interface Music</title>
<p>Audio storage formats mainly include MP3, Windows Media Audio (WMA), Musical Instrument Digital Interface (MIDI), and Waveform Audio File Format (WAV). A MIDI file usually contains three types of events: MIDI events, system-exclusive events, and meta events (<xref ref-type="bibr" rid="B27">Wu et al., 2019</xref>). How audio is stored matters greatly for music discrimination and generation (<xref ref-type="bibr" rid="B22">Siphocly et al., 2021</xref>). MIDI music is adopted here.</p>
</sec>
<sec id="S2.SS4">
<title>Design of Music Style Recognition and Generation Model</title>
<p>The most crucial thing in music signal identification is the identification of music style (<xref ref-type="bibr" rid="B19">Ram&#x00ED;rez and Flores, 2020</xref>). The function of the model is to extract music features from different styles and genres of music in the music library through track separation technology, perform vectorization processing of the extracted music feature data, and then train the model. The LSTM network in DL is mainly adopted to enable computers to produce different styles of music (<xref ref-type="bibr" rid="B24">Sun et al., 2020</xref>). <xref ref-type="fig" rid="F1">Figure 1</xref> displays a flow chart.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Flow chart of music style recognition and generation model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-762402-g001.tif"/>
</fig>
</sec>
<sec id="S2.SS5">
<title>Data Preprocessing</title>
<p>Music data can be used for training only after processing (<xref ref-type="bibr" rid="B10">Hughes et al., 2018</xref>). In this model, track separation, music feature extraction, and data vectorization are performed on the data (<xref ref-type="bibr" rid="B7">Gunawan et al., 2020</xref>). <xref ref-type="fig" rid="F2">Figure 2</xref> presents the data preprocessing flow. MIDI files come in three formats, numbered 0, 1, and 2: format 0 contains a single track; format 1 contains multiple tracks that start playing together in the same time series and at the same beat; format 2 contains multiple independent tracks that need not start simultaneously. Single-track (format 0) and independent multi-track (format 2) MIDI files are relatively simple to extract. Because the tracks in format 2 are independent, the track headers only need to be traversed in turn (<xref ref-type="bibr" rid="B6">Chen, 2019</xref>). All tracks in format 1 are synchronized, so they are harder to separate, and the tracks need to be spliced. Music features are usually divided into physical features and time-domain features. Time-domain features can only be displayed by specific instruments (<xref ref-type="bibr" rid="B20">Rogoza et al., 2018</xref>). To obtain effective loudness information, only MIDI files whose extracted data contain at least 45 distinct loudness values, each occurring more than 20 times, are kept as effective music data (<xref ref-type="bibr" rid="B30">Wu Y. J. et al., 2020</xref>). The effective music data are converted into a music score matrix that serves as the input vector of the network. The music score matrix must reflect both the structure of the MIDI file and the characteristics of the music in vector form.</p>
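<p>The MIDI format number that determines how the tracks are separated is stored in the header chunk of a Standard MIDI File. The following sketch reads that header so that a file could be routed to the appropriate extraction path; the function name read_midi_header and the hand-built example header are illustrative assumptions, and the sketch does not parse the track chunks themselves.</p>
<preformat><![CDATA[
```python
import struct


def read_midi_header(data):
    """Return (format, track_count, division) from the first 14 bytes of a MIDI file.

    The header chunk is the ASCII tag "MThd", a 4-byte length (always 6),
    and three big-endian 16-bit fields: format, number of tracks, division.
    """
    if len(data) < 14:
        raise ValueError("too short to contain a MIDI header chunk")
    chunk_id, length, midi_format, track_count, division = struct.unpack(
        ">4sIHHH", data[:14]
    )
    if chunk_id != b"MThd" or length != 6:
        raise ValueError("not a Standard MIDI File header")
    if midi_format not in (0, 1, 2):
        raise ValueError("unknown MIDI format")
    return midi_format, track_count, division


# Hand-built header for illustration: format 1, 3 tracks, 480 ticks per quarter note.
example_header = b"MThd" + struct.pack(">IHHH", 6, 1, 3, 480)
header_info = read_midi_header(example_header)
```
]]></preformat>
<p>With the format known, a preprocessing routine could read the single track of a format 0 file directly, traverse the tracks of a format 2 file in turn, and splice the synchronized tracks of a format 1 file.</p>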
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Flow chart of data preprocessing.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-762402-g002.tif"/>
</fig>
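<p>The loudness criterion for selecting effective music data can be sketched as a simple counting filter. The sketch assumes that loudness is represented by the note-on velocity of each note (an integer from 0 to 127) and that the criterion means at least 45 distinct values that each occur more than 20 times; both points are interpretations, and the function name is_effective and the synthetic data are hypothetical.</p>
<preformat><![CDATA[
```python
from collections import Counter


def is_effective(velocities, min_distinct=45, min_count=20):
    """True if at least min_distinct loudness values each occur more than min_count times."""
    counts = Counter(velocities)
    frequent_values = [value for value, count in counts.items() if count > min_count]
    return len(frequent_values) >= min_distinct


# Synthetic data for illustration: 45 (or 44) distinct velocities, 21 notes each.
rich_piece = [velocity for velocity in range(1, 46) for _ in range(21)]
sparse_piece = [velocity for velocity in range(1, 45) for _ in range(21)]
keep_rich = is_effective(rich_piece)
keep_sparse = is_effective(sparse_piece)
```
]]></preformat>
<p>Only pieces that pass this check would be converted into the music score matrix; the thresholds are exposed as parameters so that the selection rule can be adjusted.</p>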
</sec>
<sec id="S2.SS6">
<title>Design of Music Genre Generation Model</title>
<p>The music library contains music of various genres, and the pieces within each genre share the same style. The system needs to acquire the music characteristics of MIDI music in different genres, train the network, and then generate music of different styles (<xref ref-type="bibr" rid="B34">Zheng et al., 2018</xref>). The algorithm includes a music genre analysis model and a music style generation model. <xref ref-type="fig" rid="F3">Figure 3</xref> presents the model architecture. The music genre analysis model divides the learning problem into two parts. The first part learns the music features in the score and converts them into feature vectors, and the second part obtains the range of music intensity (<xref ref-type="bibr" rid="B1">Ahn et al., 2020</xref>).</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Flow chart of music genre generation model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-762402-g003.tif"/>
</fig>
<p>Different genres of music have different musical characteristics. In this system, piano MIDI music is mainly used. The main function of the music genre analysis model is to learn the style of music by learning a specific music genre, such as jazz, pop, or classical (<xref ref-type="bibr" rid="B32">Yang and Lerch, 2020</xref>). Different music genres are then distinguished according to their different music intensities. The music intensity matrix is generated in the prediction part. The model mainly includes a bidirectional LSTM network layer and a linear layer (<xref ref-type="bibr" rid="B8">Hawley et al., 2020</xref>). The input of the model is music of a specific genre. Feature learning is carried out through the bidirectional LSTM network, and the matrix containing sound intensity is generated through the linear layer. Finally, the matrix is converted into playable music with the learned musical style (<xref ref-type="bibr" rid="B5">Chang et al., 2021</xref>).</p>
<p>The music sequences in the music library differ in length: some are very long, while others are very short. A bidirectional LSTM network handles long time-series problems well (<xref ref-type="bibr" rid="B2">Briot, 2021</xref>). A bidirectional LSTM is more complex than a unidirectional LSTM, mainly in the value propagation process, and it needs more training iterations to optimize its parameters than the unidirectional LSTM does. After sufficient training, however, the accuracy of a bidirectional LSTM network is much higher than that of a unidirectional LSTM (<xref ref-type="bibr" rid="B12">Jin et al., 2020</xref>). <xref ref-type="fig" rid="F4">Figure 4</xref> displays the bidirectional LSTM network structure.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Bidirectional LSTM architecture.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-762402-g004.tif"/>
</fig>
<p>The main purpose of the model design is to generate music with a specific style. As mentioned earlier, music genres can be distinguished by performance intensity, since music of different genres has different intensities. Because the range of performance intensity is continuous and large, the linear layer is needed to convert the output values into music intensity values and to adjust their range.</p>
</sec>
<sec id="S2.SS7">
<title>Design of Music Style Analysis Model</title>
<p>The music style analysis model is mainly adopted to learn more complex style information that cannot be trained in a music genre analysis network. The LSTM neural network is mainly used in this model. <xref ref-type="fig" rid="F5">Figure 5</xref> is the structural design of the music style analysis model.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Structure diagram of music style analysis model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-762402-g005.tif"/>
</fig>
<p>The model is used to study whether computers can learn and generate different music as people do. It mainly includes the interpretation layer and the subnets of the music genre analysis network, and it is a multi-task learning model. MIDI music can be regarded as a music score. The score first passes through the interpretation layer, and the output of the interpretation layer is then used as the input of the music genre analysis network. After passing through the music genre analysis network, a matrix containing music characteristics is output. The analysis networks of different genres generate matrices for different genres. Each matrix is then converted into music that can be played (<xref ref-type="bibr" rid="B21">Shen et al., 2020</xref>).</p>
<p>The samples of music data to be analyzed are much fewer than other data, such as user information. The number of categories is relatively large, and there is no obvious distinction rule for each category. Hence, some scholars put forward the Siamese network, which is a similarity measurement method. It maps the input to the target space through a function and compares the similarity in the target space using Euclidean distance (<xref ref-type="bibr" rid="B4">Castillo and Flores, 2021</xref>).</p>
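<p>The similarity measurement performed by a Siamese network can be sketched as follows: both inputs are mapped into the target space by the same function, and the Euclidean distance between the two mapped vectors is taken as the (dis)similarity. The linear mapping, the weight values, and the feature vectors below are assumptions chosen for illustration; in practice the shared mapping would be a trained network.</p>
<preformat><![CDATA[
```python
import math


def embed(features, weights):
    """Map a feature vector into the target space with a shared linear map."""
    return [sum(w * x for w, x in zip(row, features)) for row in weights]


def siamese_distance(first, second, weights):
    """Euclidean distance between two inputs after the same embedding is applied to both."""
    embedded_first = embed(first, weights)
    embedded_second = embed(second, weights)
    return math.sqrt(
        sum((p - q) ** 2 for p, q in zip(embedded_first, embedded_second))
    )


# Illustrative values: an identity mapping and two 2-dimensional feature vectors.
shared_weights = [[1.0, 0.0], [0.0, 1.0]]
distance = siamese_distance([0.0, 0.0], [3.0, 4.0], shared_weights)
```
]]></preformat>
<p>Because both branches share the same weights, a small distance indicates that two pieces are similar in the learned space, which is useful when each category has only a few samples.</p>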
<p>Because music has different genres, multiple music genre analysis subnets need to be designed in the music style analysis model to better learn different music styles. Each subnet is connected with the interpretation layer, and the output of the interpretation layer is used as the input of each subnet unit. The music style analysis model thus includes multiple music genre analysis networks, and its biggest difference from the music genre analysis model is the additional interpretation layer. The interpretation layer reduces the training time of the model. The multitasking mechanism improves training efficiency and allows several kinds of music to be analyzed simultaneously.</p>
<p>A deep belief network combined with the Softmax algorithm is designed to classify the generated music genres. Equation (3) is the standard adopted to evaluate the accuracy of genre classification:</p>
<disp-formula id="S2.E3">
<label>(3)</label>
<mml:math id="M3">
<mml:mrow>
<mml:mi>Q</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mi>N</mml:mi>
<mml:mi>M</mml:mi>
</mml:mfrac>
<mml:mo>&#x00D7;</mml:mo>
<mml:mrow>
<mml:mn>100</mml:mn>
<mml:mo>%</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p><italic>N</italic> represents the number of music samples whose genre is correctly identified, <italic>M</italic> is the total number of music samples tested, and <italic>Q</italic> represents the accuracy.</p>
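<p>Equation (3) can be illustrated together with a Softmax prediction step. In the sketch below, each sample has one raw score per genre, Softmax turns the scores into probabilities, the most probable genre is taken as the prediction, and the accuracy is the percentage of correct predictions. The genre list, the scores, and the function names are made-up assumptions for illustration and are not experimental data from this study.</p>
<preformat><![CDATA[
```python
import math

GENRES = ["jazz", "classical", "rock"]  # illustrative subset of genres


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp
    exponentials = [math.exp(score - peak) for score in scores]
    total = sum(exponentials)
    return [value / total for value in exponentials]


def predict_genre(scores):
    """Return the genre with the highest Softmax probability."""
    probabilities = softmax(scores)
    return GENRES[probabilities.index(max(probabilities))]


def accuracy_percent(predicted, actual):
    """Q = N / M * 100, with N correct predictions out of M tested samples."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual) * 100


# Made-up scores for four samples; the last one is misclassified on purpose.
scores_per_sample = [
    [2.0, 0.5, 0.1],
    [0.2, 1.5, 0.3],
    [0.1, 0.2, 3.0],
    [1.0, 2.0, 0.5],
]
true_genres = ["jazz", "classical", "rock", "jazz"]
predictions = [predict_genre(scores) for scores in scores_per_sample]
accuracy = accuracy_percent(predictions, true_genres)
```
]]></preformat>
<p>Here three of the four samples are classified correctly, so <italic>N</italic> = 3 and <italic>M</italic> = 4.</p>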
<p>The algorithm with the best classification effect among the music genre classification algorithms is selected in the experiment. It is based on the deep belief network in DL; after the network is improved, Softmax is adopted to predict the genre of music. The restricted Boltzmann machine (RBM), the building block of the deep belief network, is a stochastic network whose particularity is mainly reflected in two aspects. The first is the probability distribution function, because the node states of the network are random. The second is the energy function, which represents the stability of the network state: the lower the energy value, the more stable the network state. Equation (4) defines the energy of the network.</p>
<disp-formula id="S2.E4">
<label>(4)</label>
<mml:math id="M4">
<mml:mrow>
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munderover>
<mml:mrow>
<mml:munderover>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>&#x03C8;</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
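<p><italic>s</italic><sub><italic>i</italic></sub> is the state of neuron <italic>i</italic>, &#x03C8;<sub><italic>i</italic></sub> is the bias term associated with neuron <italic>i</italic>, and <italic>w</italic><sub><italic>ij</italic></sub> denotes the connection weight between neurons <italic>i</italic> and <italic>j</italic>.</p>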
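<p>The energy in Equation (4) can be evaluated term by term, as in the following sketch. The function and argument names are hypothetical, the argument psi holds the &#x03C8;<sub><italic>i</italic></sub> terms, and the three-neuron state, weights, and &#x03C8; values are made-up numbers chosen only to illustrate the arithmetic.</p>
<preformat>
```python
def network_energy(state, weights, psi):
    """Evaluate W(s) from Equation (4) for one network state.

    state   : list of neuron states s_i
    weights : square matrix (list of lists); only entries with j > i are used
    psi     : list of the per-neuron psi_i terms
    """
    n = len(state)
    pair_term = 0.0
    # Double sum over each unordered pair (i, j) with j > i.
    for i in range(n - 1):
        for j in range(i + 1, n):
            pair_term += weights[i][j] * state[i] * state[j]
    # Single sum over the per-neuron terms.
    unit_term = sum(p * s for p, s in zip(psi, state))
    return -pair_term - unit_term


# Made-up three-neuron example (assumption, not data from the paper).
example_state = [1, 0, 1]
example_weights = [
    [0.0, 0.5, -1.0],
    [0.5, 0.0, 2.0],
    [-1.0, 2.0, 0.0],
]
example_psi = [0.25, 0.5, 0.25]

# Only the pair (0, 2) is active: pair_term = -1.0, unit_term = 0.5.
print(network_energy(example_state, example_weights, example_psi))  # 0.5
```
</preformat>
<p>Because only pairs with <italic>j</italic> greater than <italic>i</italic> are visited, each connection is counted once, matching the summation limits in Equation (4).</p>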
</sec>
</sec>
<sec id="S3">
<title>Results and Analysis</title>
<sec id="S3.SS1">
<title>Analysis of Experimental Results of Music Style Recognition and Generation</title>
<p>The experiment is run for 1,000, 2,000, 3,000, 4,000, 5,000, and 6,000 iterations to find the best generated music sequence. The error decreases as the number of iterations of the LSTM network increases, which shows that the actual output value approaches the target value and the training results become more accurate. <xref ref-type="fig" rid="F6">Figure 6</xref> displays the influence of the number of iterations on the training effect.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption><p>Influence of different iteration times on training effect.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-762402-g006.tif"/>
</fig>
<p>In the model design, the influence of the number of neurons in the hidden layer on the experiment is analyzed to optimize the model&#x2019;s parameters. During the experiment, the number of hidden layers and the corresponding number of neurons are varied. The number of hidden layers is 1, 2, 3, and 4, and the number of neurons is 1,024, 512, 256, and 128, respectively. <xref ref-type="fig" rid="F7">Figure 7</xref> displays the effect of hidden layer neurons on experimental error. The error of the network training results is smallest when the network has four hidden layers with 1,024, 512, 256, and 128 neurons, respectively.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption><p>Influence of the number of hidden layer neurons on error.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-762402-g007.tif"/>
</fig>
</sec>
<sec id="S3.SS2">
<title>Analysis of the Generated Music Sequence Spectrum and Original Music Sequence Spectrum</title>
<p>In the experiment, music of different genres is processed by a fast Fourier transform. The resulting signals are then examined by spectrum analysis and sound spectrum analysis to ensure the accuracy of the experimental data. The spectra of the music sequences generated with different numbers of hidden layers are analyzed. <xref ref-type="fig" rid="F8">Figure 8</xref> displays the generated music spectra and the sample spectrum, and shows that the LSTM has a clear effect on music analysis. As the number of network layers increases, the spectrum of the generated music becomes closer to the original spectrum, indicating higher accuracy. <xref ref-type="fig" rid="F8">Figure 8A</xref> displays the original music. <xref ref-type="fig" rid="F8">Figure 8B</xref> shows that the learned music contains multiple unknown frequencies when there is only one hidden layer. <xref ref-type="fig" rid="F8">Figure 8C</xref> shows that some frequencies do not appear when there are two layers. <xref ref-type="fig" rid="F8">Figure 8D</xref> shows that, with three layers, the generated music sequence is very similar to the original music sequence, although some differences remain. <xref ref-type="fig" rid="F8">Figure 8E</xref> reveals that the difference between the generated music sequence and the original music sequence is very small when there are four layers, suggesting that the generated music is most accurate with four hidden layers.</p>
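<p>The kind of spectrum compared above can be illustrated with a short sketch. For self-containment it uses a direct discrete Fourier transform, which yields the same spectrum as a fast Fourier transform but more slowly; the function names, the 64-sample test tone, and the sampling rate are assumptions for illustration and do not reproduce the experimental pipeline.</p>
<preformat>
```python
import cmath
import math


def magnitude_spectrum(samples):
    """Magnitude of the discrete Fourier transform for bins 0 .. n // 2."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        total = 0j
        for t, x in enumerate(samples):
            total += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(total))
    return spectrum


def dominant_frequency(samples, sample_rate):
    """Frequency in Hz of the strongest non-DC bin."""
    spectrum = magnitude_spectrum(samples)
    peak_bin = max(range(1, len(spectrum)), key=lambda k: spectrum[k])
    return peak_bin * sample_rate / len(samples)


# Hypothetical test signal: an 8 Hz sine sampled at 64 Hz for one second.
sample_rate = 64
tone = [math.sin(2 * math.pi * 8 * t / sample_rate) for t in range(sample_rate)]
print(dominant_frequency(tone, sample_rate))  # 8.0
```
</preformat>
<p>Comparing the magnitude spectra of a generated sequence and of the original sequence bin by bin is one simple way to quantify how close the two spectra are.</p>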
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption><p>Generated music spectrum and sample spectrum <bold>(A)</bold> original music; <bold>(B)</bold> one hidden layer; <bold>(C)</bold> two hidden layers; <bold>(D)</bold> three hidden layers; <bold>(E)</bold> four hidden layers.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-762402-g008.tif"/>
</fig>
</sec>
<sec id="S3.SS3">
<title>Generated Music Genre Classification Results</title>
<p>Five music genres are selected, with 40 pieces of music per genre, all in MIDI format.</p>
<p>The classification accuracy of the deep belief network combined with the Softmax algorithm is shown in <xref ref-type="table" rid="T1">Table 1</xref>. The classification accuracy for the jazz, classical, rock, country, and disco genres reaches 77.5, 65, 60, 67.5, and 70%, respectively. The experimental data show that the designed algorithm achieves higher classification accuracy than the traditional algorithm for every genre. They also show that the music style and genre recognition and generation network in this experiment can generate music of different genres, with a classification accuracy of at least 60% for each genre.</p>
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Comparison of genre classification accuracy between the designed method and the traditional method.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Music genre</td>
<td valign="top" align="center">Designed method (accuracy)</td>
<td valign="top" align="center">Traditional method (accuracy)</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Jazz</td>
<td valign="top" align="center">77.5%</td>
<td valign="top" align="center">45%</td>
</tr>
<tr>
<td valign="top" align="left">Classical music</td>
<td valign="top" align="center">65%</td>
<td valign="top" align="center">34%</td>
</tr>
<tr>
<td valign="top" align="left">Rock</td>
<td valign="top" align="center">60%</td>
<td valign="top" align="center">42%</td>
</tr>
<tr>
<td valign="top" align="left">Country</td>
<td valign="top" align="center">67.5%</td>
<td valign="top" align="center">55%</td>
</tr>
<tr>
<td valign="top" align="left">Disco</td>
<td valign="top" align="center">70%</td>
<td valign="top" align="center">63%</td>
</tr>
</tbody>
</table></table-wrap>
</sec>
<sec id="S3.SS4">
<title>Analysis of Two Genres of Generative Music</title>
<p>The classification results show that the music style recognition and generation algorithm performs well in generating music of different genres. The spectra of music generated by two different models, LSTM and RBM, are analyzed in this experiment.</p>
<p>The experimental results of the different algorithms are compared in <xref ref-type="fig" rid="F9">Figure 9</xref>. The music generated by the traditional RBM is similar to the original music, but its accuracy is lower than that of the proposed method. Therefore, the designed algorithm can better generate music of different genres.</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption><p>Spectrum diagram of original music, RBM, and the method proposed <bold>(A)</bold> original music; <bold>(B)</bold> RBM; <bold>(C)</bold> the method proposed. <bold>(A)</bold> is the spectrum diagram of the original music. The value of the original sample music does not exceed 20 when the frequency is about 5,000 Hz, but this value is significantly exceeded in <bold>(B)</bold>. <bold>(C)</bold> suggests that the music spectrogram generated by the proposed method is consistent with the original music spectrogram in both the overall frequency distribution and the sample frequency distribution.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpsyg-13-762402-g009.tif"/>
</fig>
</sec>
</sec>
<sec id="S4" sec-type="conclusion">
<title>Conclusion</title>
<p>In recent years, the rise of DL and machine learning and the rapid progress of computer software and hardware performance have laid a good foundation for the automatic generation of music of different genres. Previously, most researchers used DL networks for music genre classification and recognition but rarely for music generation, so using the LSTM network for music generation has research significance. The LSTM network handles long time-series problems well and is therefore often used for semantic analysis. On this basis, the LSTM network is applied to music generation. Because music has different genres, the network model is redesigned so that it can generate music styles of different genres through multitasking while improving training efficiency.</p>
<p>Finally, a network that can generate music styles of different genres is designed, and the music data in the GTZAN music genre library are used as experimental data for testing. The audio and spectrograms of the generated music and the original music are compared and analyzed. The analysis of the spectrum and sound spectrum of the generated and original music sequences shows that the network performs well in music generation. However, this exploration still has limitations: the accuracy of the designed algorithm needs to be further improved, and the music files studied are limited to MIDI format.</p>
</sec>
<sec id="S5" sec-type="data-availability">
<title>Data Availability Statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="S6">
<title>Ethics Statement</title>
<p>The studies involving human participants were reviewed and approved by the Ethics Committee of Yunnan Minzu University. The patients/participants provided their written informed consent to participate in this study. Written informed consent was obtained from the individual(s) for the publication of any potentially identifiable images or data included in this article.</p>
</sec>
<sec id="S7">
<title>Author Contributions</title>
<p>All authors listed have made a substantial, direct, and intellectual contribution to the work, and approved it for publication.</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="pudiscl1" sec-type="disclaimer">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<sec id="S8" sec-type="funding-information">
<title>Funding</title>
<p>This work was supported by the Annual Project of Hunan Provincial Philosophy and Social Science Fund Annual Project Youth Project &#x201C;Dongting Cultural Circle Ethnic Traditional Music Intangible Cultural Heritage Digital Protection and Active Inheritance Project&#x201D; (No. 21YBQ081) and the Hunan Provincial Education Science &#x201C;14th Five-Year&#x201D; Planning Topic &#x201C;Local Colleges Music Professional Service Rural Revitalization Innovation Approach&#x201D; (No. ND212272).</p>
</sec>
<sec id="S9" sec-type="supplementary-material">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fpsyg.2022.762402/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fpsyg.2022.762402/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.ZIP" id="DS1" mimetype="application/zip" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table_1.XLSX" id="TS1" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Table_2.XLSX" id="TS2" mimetype="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ahn</surname> <given-names>H.</given-names></name> <name><surname>Kim</surname> <given-names>J.</given-names></name> <name><surname>Kim</surname> <given-names>K.</given-names></name> <name><surname>Oh</surname> <given-names>S.</given-names></name></person-group> (<year>2020</year>). <article-title>Generative autoregressive networks for 3d dancing move synthesis from music.</article-title> <source><italic>IEEE Robot. Autom. Lett.</italic></source> <volume>5</volume> <fpage>3500</fpage>&#x2013;<lpage>3507</lpage>. <pub-id pub-id-type="doi">10.1109/LRA.2020.2977333</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Briot</surname> <given-names>J. P.</given-names></name></person-group> (<year>2021</year>). <article-title>From artificial neural networks to deep learning for music generation: history, concepts and trends.</article-title> <source><italic>Neural. Comput. Appl.</italic></source> <volume>33</volume> <fpage>39</fpage>&#x2013;<lpage>65</lpage>. <pub-id pub-id-type="doi">10.1007/s00521-020-05399-0</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Caparrini</surname> <given-names>A.</given-names></name> <name><surname>Arroyo</surname> <given-names>J.</given-names></name> <name><surname>P&#x00E9;rez-Molina</surname> <given-names>L.</given-names></name> <name><surname>S&#x00E1;nchez-Hern&#x00E1;ndez</surname> <given-names>J.</given-names></name></person-group> (<year>2020</year>). <article-title>Automatic subgenre classification in an electronic dance music taxonomy.</article-title> <source><italic>J. New. Music Res.</italic></source> <volume>49</volume> <fpage>269</fpage>&#x2013;<lpage>284</lpage>. <pub-id pub-id-type="doi">10.1080/09298215.2020.1761399</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Castillo</surname> <given-names>J. R.</given-names></name> <name><surname>Flores</surname> <given-names>M. J.</given-names></name></person-group> (<year>2021</year>). <article-title>Web-based music genre classification for timeline song visualization and analysis.</article-title> <source><italic>IEEE Access</italic></source> <volume>9</volume> <fpage>18801</fpage>&#x2013;<lpage>18816</lpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2021.3053864</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chang</surname> <given-names>J. W.</given-names></name> <name><surname>Hung</surname> <given-names>J. C.</given-names></name> <name><surname>Lin</surname> <given-names>K. C.</given-names></name></person-group> (<year>2021</year>). <article-title>Singability-enhanced lyric generator with music style transfer.</article-title> <source><italic>Comput. Commun.</italic></source> <volume>168</volume> <fpage>33</fpage>&#x2013;<lpage>53</lpage>. <pub-id pub-id-type="doi">10.1016/j.comcom.2021.01.002</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>The impact of expatriates&#x2019; cross-cultural adjustment on work stress and job involvement in the high-tech Industry.</article-title> <source><italic>Front. Psychol.</italic></source> <volume>10</volume>:<issue>2228</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2019.02228</pub-id> <pub-id pub-id-type="pmid">31649581</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gunawan</surname> <given-names>A. A. S.</given-names></name> <name><surname>Iman</surname> <given-names>A. P.</given-names></name> <name><surname>Suhartono</surname> <given-names>D.</given-names></name></person-group> (<year>2020</year>). <article-title>Automatic music generator using recurrent neural network.</article-title> <source><italic>Int. J. Comput. Int. Syst.</italic></source> <volume>13</volume> <fpage>645</fpage>&#x2013;<lpage>654</lpage>. <pub-id pub-id-type="doi">10.2991/ijcis.d.200519.001</pub-id> <pub-id pub-id-type="pmid">32175718</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hawley</surname> <given-names>S. H.</given-names></name> <name><surname>Chatziiannou</surname> <given-names>V.</given-names></name> <name><surname>Morrison</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>Synthesis of musical instrument sounds: physics-based modeling or machine learning.</article-title> <source><italic>Phys. Today</italic></source> <volume>16</volume> <fpage>20</fpage>&#x2013;<lpage>28</lpage>. <pub-id pub-id-type="doi">10.1121/AT.2020.16.1.20</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hongdan</surname> <given-names>W.</given-names></name> <name><surname>SalmiJamali</surname> <given-names>S.</given-names></name> <name><surname>Zhengping</surname> <given-names>C.</given-names></name> <name><surname>Qiaojuan</surname> <given-names>S.</given-names></name> <name><surname>Le</surname> <given-names>R.</given-names></name></person-group> (<year>2022</year>). <article-title>An intelligent music genre analysis using feature extraction and classification using deep learning techniques.</article-title> <source><italic>Comput. Electr. Eng.</italic></source> <volume>100</volume>:<issue>107978</issue>. <pub-id pub-id-type="doi">10.1016/j.compeleceng.2022.107978</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hughes</surname> <given-names>L. H.</given-names></name> <name><surname>Schmitt</surname> <given-names>M.</given-names></name> <name><surname>Mou</surname> <given-names>L.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Zhu</surname> <given-names>X. X.</given-names></name></person-group> (<year>2018</year>). <article-title>Identifying corresponding patches in SAR and optical images with a pseudo-siamese CNN.</article-title> <source><italic>IEEE Geosci. Remote S.</italic></source> <volume>15</volume> <fpage>784</fpage>&#x2013;<lpage>788</lpage>. <pub-id pub-id-type="doi">10.1109/LGRS.2018.2799232</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname> <given-names>W.</given-names></name> <name><surname>Liu</surname> <given-names>J.</given-names></name> <name><surname>Zhang</surname> <given-names>X.</given-names></name> <name><surname>Wang</surname> <given-names>S.</given-names></name> <name><surname>Jiang</surname> <given-names>Y.</given-names></name></person-group> (<year>2020</year>). <article-title>Analysis and modeling of timbre perception features in musical sounds.</article-title> <source><italic>Math. Mod. Meth. Appl. S.</italic></source> <volume>10</volume>:<issue>789</issue>. <pub-id pub-id-type="doi">10.3390/app10030789</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jin</surname> <given-names>C.</given-names></name> <name><surname>Tie</surname> <given-names>Y.</given-names></name> <name><surname>Bai</surname> <given-names>Y.</given-names></name> <name><surname>Lv</surname> <given-names>X.</given-names></name> <name><surname>Liu</surname> <given-names>S.</given-names></name></person-group> (<year>2020</year>). <article-title>A style-specific music composition neural network.</article-title> <source><italic>Neural. Process. Lett.</italic></source> <volume>52</volume> <fpage>1893</fpage>&#x2013;<lpage>1912</lpage>. <pub-id pub-id-type="doi">10.1007/s11063-020-10241-8</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>S. T.</given-names></name> <name><surname>Oh</surname> <given-names>J. H.</given-names></name></person-group> (<year>2021</year>). <article-title>Music intelligence: granular data and prediction of top ten hit songs.</article-title> <source><italic>Decis. Support Syst.</italic></source> <volume>145</volume>:<issue>113535</issue>. <pub-id pub-id-type="doi">10.1016/j.dss.2021.113535</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>J.</given-names></name> <name><surname>Han</surname> <given-names>L.</given-names></name> <name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Zhu</surname> <given-names>J.</given-names></name> <name><surname>Yuan</surname> <given-names>B.</given-names></name> <name><surname>Gou</surname> <given-names>Z.</given-names></name></person-group> (<year>2022</year>). <article-title>An evaluation of deep neural network models for music classification using spectrograms.</article-title> <source><italic>Multimed. Tools Appl.</italic></source> <volume>81</volume> <fpage>4621</fpage>&#x2013;<lpage>4647</lpage>. <pub-id pub-id-type="doi">10.1007/s11042-020-10465-9</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ma</surname> <given-names>B.</given-names></name> <name><surname>Greer</surname> <given-names>T.</given-names></name> <name><surname>Knox</surname> <given-names>D.</given-names></name> <name><surname>Narayanan</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <article-title>A computational lens into how music characterizes genre in film.</article-title> <source><italic>PLoS One</italic></source> <volume>16</volume>:<issue>e0249957</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0249957</pub-id> <pub-id pub-id-type="pmid">33831109</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nag</surname> <given-names>S.</given-names></name> <name><surname>Basu</surname> <given-names>M.</given-names></name> <name><surname>Sanyal</surname> <given-names>S.</given-names></name> <name><surname>Banerjee</surname> <given-names>A.</given-names></name> <name><surname>Ghosh</surname> <given-names>D.</given-names></name></person-group> (<year>2022</year>). <article-title>On the application of deep learning and multifractal techniques to classify emotions and instruments using Indian classical music.</article-title> <source><italic>Physica A</italic></source> <volume>597</volume>:<issue>127261</issue>. <pub-id pub-id-type="doi">10.1016/j.physa.2022.127261</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ng</surname> <given-names>W. W.</given-names></name> <name><surname>Zeng</surname> <given-names>W.</given-names></name> <name><surname>Wang</surname> <given-names>T.</given-names></name></person-group> (<year>2020</year>). <article-title>Multi-level local feature coding fusion for music genre recognition.</article-title> <source><italic>IEEE Access</italic></source> <volume>8</volume> <fpage>152713</fpage>&#x2013;<lpage>152727</lpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2020.3017661</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Quazi</surname> <given-names>G. R.</given-names></name> <name><surname>Mohammed</surname> <given-names>N.</given-names></name> <name><surname>Sadia</surname> <given-names>Z. P.</given-names></name> <name><surname>Sabrina</surname> <given-names>A.</given-names></name></person-group> (<year>2021</year>). <article-title>Comparative analysis of three improved deep learning architectures for music genre classification.</article-title> <source><italic>J. Comput. Sci. Tech. Ch.</italic></source> <volume>2</volume> <fpage>1</fpage>&#x2013;<lpage>14</lpage>. <pub-id pub-id-type="doi">10.5815/ijitcs.2021.02.01</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ram&#x00ED;rez</surname> <given-names>J.</given-names></name> <name><surname>Flores</surname> <given-names>M. J.</given-names></name></person-group> (<year>2020</year>). <article-title>Machine learning for music genre: multifaceted review and experimentation with audioset.</article-title> <source><italic>J. Intell. Inf. Syst.</italic></source> <volume>55</volume> <fpage>469</fpage>&#x2013;<lpage>499</lpage>. <pub-id pub-id-type="doi">10.1007/s10844-019-00582-9</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rogoza</surname> <given-names>R.</given-names></name> <name><surname>&#x017B;emojtel-Piotrowska</surname> <given-names>M.</given-names></name> <name><surname>Kwiatkowska</surname> <given-names>M. M.</given-names></name> <name><surname>Kwiatkowska</surname> <given-names>K.</given-names></name></person-group> (<year>2018</year>). <article-title>The bright, the dark, and the blue face of narcissism: the spectrum of narcissism in its relations to the metatraits of personality, self-esteem, and the nomological network of shyness, loneliness, and empathy.</article-title> <source><italic>Front. Psychol.</italic></source> <volume>9</volume>:<issue>343</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2018.00343</pub-id> <pub-id pub-id-type="pmid">29593627</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shen</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>R.</given-names></name> <name><surname>Shen</surname> <given-names>H. W.</given-names></name></person-group> (<year>2020</year>). <article-title>Visual exploration of latent space for traditional Chinese music.</article-title> <source><italic>Visual. Neurosci.</italic></source> <volume>4</volume> <fpage>99</fpage>&#x2013;<lpage>108</lpage>. <pub-id pub-id-type="doi">10.1016/j.visinf.2020.04.003</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Siphocly</surname> <given-names>N. N. J.</given-names></name> <name><surname>El-Horbaty</surname> <given-names>E. S. M.</given-names></name> <name><surname>Salem</surname> <given-names>A. B. M.</given-names></name></person-group> (<year>2021</year>). <article-title>Top 10 artificial intelligence algorithms in computer music composition.</article-title> <source><italic>Int. J. Innov. Comput. I.</italic></source> <volume>10</volume> <fpage>373</fpage>&#x2013;<lpage>394</lpage>. <pub-id pub-id-type="doi">10.12785/ijcds/100138</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sitaula</surname> <given-names>C.</given-names></name> <name><surname>He</surname> <given-names>J.</given-names></name> <name><surname>Priyadarshi</surname> <given-names>A.</given-names></name> <name><surname>Tracy</surname> <given-names>M.</given-names></name> <name><surname>Kavehei</surname> <given-names>O.</given-names></name> <name><surname>Hinder</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>Neonatal bowel sound detection using convolutional neural network and laplace hidden semi-markov model.</article-title> <source><italic>arXiv 210807467</italic></source> [<comment>Preprint</comment>]. <pub-id pub-id-type="doi">10.48550/arXiv.2108.07467</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sun</surname> <given-names>G.</given-names></name> <name><surname>Wong</surname> <given-names>Y.</given-names></name> <name><surname>Cheng</surname> <given-names>Z.</given-names></name> <name><surname>Kankanhalli</surname> <given-names>M. S.</given-names></name> <name><surname>Geng</surname> <given-names>W.</given-names></name> <name><surname>Li</surname> <given-names>X.</given-names></name></person-group> (<year>2020</year>). <article-title>DeepDance: music-to-dance motion choreography with adversarial learning.</article-title> <source><italic>IEEE T. Multimedia.</italic></source> <volume>23</volume> <fpage>497</fpage>&#x2013;<lpage>509</lpage>. <pub-id pub-id-type="doi">10.1109/TMM.2020.2981989</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wick</surname> <given-names>C.</given-names></name> <name><surname>Puppe</surname> <given-names>F.</given-names></name></person-group> (<year>2021</year>). <article-title>Experiments and detailed error-analysis of automatic square notation transcription of medieval music manuscripts using CNN/LSTM-networks and a neume dictionary.</article-title> <source><italic>J. New Music Res.</italic></source> <volume>50</volume> <fpage>18</fpage>&#x2013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.1080/09298215.2021.1873393</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>W.</given-names></name> <name><surname>Wang</surname> <given-names>H.</given-names></name> <name><surname>Wu</surname> <given-names>Y.</given-names></name></person-group> (<year>2020</year>). <article-title>Internal and external networks, and incubatees&#x2019; performance in dynamic environments: entrepreneurial learning&#x2019;s mediating effect.</article-title> <source><italic>J. Technol. Transf.</italic></source> <volume>46</volume> <fpage>1707</fpage>&#x2013;<lpage>1733</lpage>. <pub-id pub-id-type="doi">10.1007/s10961-10020-09790-w</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>W.</given-names></name> <name><surname>Wang</surname> <given-names>H.</given-names></name> <name><surname>Zheng</surname> <given-names>C.</given-names></name> <name><surname>Wu</surname> <given-names>Y. J.</given-names></name></person-group> (<year>2019</year>). <article-title>Effect of narcissism, psychopathy, and Machiavellianism on entrepreneurial intention&#x2014;the mediating of entrepreneurial self-efficacy.</article-title> <source><italic>Front. Psychol.</italic></source> <volume>10</volume>:<issue>360</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2019.00360</pub-id> <pub-id pub-id-type="pmid">30846958</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Y.</given-names></name> <name><surname>Song</surname> <given-names>D.</given-names></name></person-group> (<year>2019</year>). <article-title>Gratifications for social media use in entrepreneurship courses: learners&#x2019; perspective.</article-title> <source><italic>Front. Psychol.</italic></source> <volume>10</volume>:<issue>1270</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2019.01270</pub-id> <pub-id pub-id-type="pmid">31214081</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Y.</given-names></name> <name><surname>Wu</surname> <given-names>T.</given-names></name></person-group> (<year>2017</year>). <article-title>A decade of entrepreneurship education in the Asia Pacific for future directions in theory and practice.</article-title> <source><italic>Manag. Decis.</italic></source> <volume>55</volume> <fpage>1333</fpage>&#x2013;<lpage>1350</lpage>. <pub-id pub-id-type="doi">10.1108/MD-05-2017-0518</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Y. J.</given-names></name> <name><surname>Liu</surname> <given-names>W. J.</given-names></name> <name><surname>Yuan</surname> <given-names>C. H.</given-names></name></person-group> (<year>2020</year>). <article-title>A mobile-based barrier-free service transportation platform for people with disabilities.</article-title> <source><italic>Comput. Hum. Behav.</italic></source> <volume>107</volume>:<issue>105776</issue>. <pub-id pub-id-type="doi">10.1016/j.chb.2018.11.005</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>Y. J.</given-names></name> <name><surname>Yuan</surname> <given-names>C. H.</given-names></name> <name><surname>Pan</surname> <given-names>C. I.</given-names></name></person-group> (<year>2018</year>). <article-title>Entrepreneurship education: an experimental study with information and communication technology.</article-title> <source><italic>Sustain. Basel</italic></source> <volume>10</volume>:<issue>691</issue>. <pub-id pub-id-type="doi">10.3390/su10030691</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>L. C.</given-names></name> <name><surname>Lerch</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>On the evaluation of generative models in music.</article-title> <source><italic>Neural Comput. Appl.</italic></source> <volume>32</volume> <fpage>4773</fpage>&#x2013;<lpage>4784</lpage>. <pub-id pub-id-type="doi">10.1007/s00521-018-3849-7</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yuan</surname> <given-names>C. H.</given-names></name> <name><surname>Wu</surname> <given-names>Y. J.</given-names></name></person-group> (<year>2020</year>). <article-title>Mobile instant messaging or face-to-face? Group interactions in cooperative simulations.</article-title> <source><italic>Comput. Hum. Behav.</italic></source> <volume>113</volume>:<issue>106508</issue>. <pub-id pub-id-type="doi">10.1016/j.chb.2020.106508</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zheng</surname> <given-names>W.</given-names></name> <name><surname>Wu</surname> <given-names>Y.</given-names></name> <name><surname>Chen</surname> <given-names>L.</given-names></name></person-group> (<year>2018</year>). <article-title>Business intelligence for patient-centeredness: a systematic review.</article-title> <source><italic>Telematics. Inf.</italic></source> <volume>35</volume> <fpage>665</fpage>&#x2013;<lpage>676</lpage>. <pub-id pub-id-type="doi">10.1016/j.tele.2017.06.015</pub-id></citation></ref>
</ref-list>
</back>
</article>