Fuzzy Theory-based Air Valve Control for Auto-Score-Recognition Soprano Recorder Machines
- DOI: 10.2991/jrnal.k.211108.010
- Auto-score-recognition soprano recorder machines (ASRSRM); LabVIEW; Arduino; fuzzy theory-based air valve control; pneumatic cylinder
Previous work on score recognition and automated recorder performance has several shortcomings, which this article addresses. First, for music score recognition, a y-axis projection method is used to detect the staff position and remove it, replacing the erosion and dilation operations of morphology. The detected staff position can then be used to distinguish the notes, which are written at specific positions on the staff. For soprano recorder playing, our earlier design used finger-shaped electric arms to press the blow hole, and these could not keep up with the tempo of the score. To remove this drawback, the motors are replaced with solenoid valves that drive pneumatic cylinders to press the blow hole smoothly. In addition, since different pitches on the soprano recorder require different air pressures, we increase the number of air valves from one to three, and the range is divided into bass, midrange, and treble. Furthermore, fuzzy theory-based air valve control is applied to the auto-score-recognition soprano recorder machine to greatly reduce the sound distortion caused by the original single air valve. Experiments show that fuzzy theory-based air valve control combined with sheet music recognition techniques can fully realize the functions of an autoplaying soprano recorder machine.
- © 2021 The Author. Published by Atlantis Press International B.V.
- Open Access
- This is an open access article distributed under the CC BY-NC 4.0 license (http://creativecommons.org/licenses/by-nc/4.0/).
Nowadays, robots are no longer used only in industrial production; they can also be seen in medical and artistic fields. The International Robotic Art Competition, which started in 2016, has been held three times, and many works created by robots have reached a level comparable to that of human artists. Performance robots have also advanced significantly thanks to artificial intelligence. Among these topics, music score recognition has been studied in many publications. Although the set of music notation symbols is limited and their writing positions on the staff are restricted, there are still many blind spots to overcome when recognizing scores with artificial intelligence, even for non-handwritten scores.
Lee et al. proposed a two-wheeled robot that can autonomously read music and sing songs with a vocal voice. Its disadvantage is that it recognizes only digital scores. Hong proposed using mixed media to draw portraits in a hand-drawn style. The input image undergoes different image processing steps to extract features, which are combined to form a Non-Photorealistic Rendering (NPR)-style image. Because only five colors are used for mixing, the error in color toning is about 10%. Chang et al. used optical symbol recognition. Because this approach relies on a symbol database, even slight deviations in a symbol can cause recognition errors. Xiao et al. presented a real-time optical music recognition system for a dulcimer musical robot. Since the note groups are decomposed into only three fundamental elements, certain notes are easily misrecognized. Wang and Chen proposed a musical score recognition system for iOS devices. Because its recognition process requires human assistance, it cannot achieve fully automatic recognition.
To address the shortcomings of the work above, this article proposes fuzzy theory-based air valve control combined with sheet music recognition techniques to improve score recognition and reduce out-of-tune sound. Experiments show that the presented method can fully realize the functions of Auto-Score-Recognition Soprano Recorder Machines (ASRSRM).
To solve the problem that the finger-shaped robotic arms cannot keep up with the rhythm of the music, all motors are replaced with solenoid valves, which are combined with the Arduino UNO board to control the air cylinders that press the blow hole. The finished ASRSRM is shown in Figure 1.
2.1. Control Systems
The Arduino UNO board is the control core of the ASRSRM. LINX, in the LabVIEW graphical programming environment, is the bridge between the LabVIEW program and the Arduino UNO board. Like the LabVIEW Interface for Arduino (LIFA), LINX requires a set of communication commands to be written to the Arduino first. The difference is that LIFA can provide users with the ability to use Arduino connection operations directly in LabVIEW without writing Arduino code. For other details, please refer to Wang and Jhang. The control system architecture is shown in Figure 2, and Figure 3 shows the Arduino UNO control board.
2.2. ASRSRM Platform
The physical architecture of the ASRSRM platform is shown in Figure 4. Its main feature is that the blow hole of the flute is now pressed by pneumatic cylinders controlled by solenoid valves. This removes the earlier limitation that music could not be played faster than the motors could respond. In addition, two more solenoid valves were added to the blowing control, which solves the problem of broken sound when blowing. To make the blowing smoother and closer to the way a human plays the flute, the new ASRSRM platform increases the original three solenoid valves to nine, as shown in Figure 5.
3. MUSIC SCORE RECOGNITION SYSTEM
First, we eliminate the parts of the score that are not related to performance, as shown in Figure 6. The processed music score is input into the system. After that, binarization and scale recognition are performed through LabVIEW. Figure 7 is its program flow chart.
First, the note stems are distinguished and eliminated by the x-axis projection detection method. Second, the center point of the head of the note and the position of the staff are identified by the pixel clustering method. Finally, the encoding of the scale is arranged in sequence from bass to treble. For other details, please refer to Wang and Jhang .
Binarization divides the grayscale image into black and white according to a threshold set by the user. When the grayscale value of a pixel is greater than the threshold, the pixel is judged to be white; otherwise it is black. In this way the grayscale image is converted into a binary image. For other details, please refer to Wang and Jhang.
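The thresholding rule just described can be illustrated with a minimal Python sketch. The paper performs this step in LabVIEW, so the function name `binarize`, the list-of-rows image layout, and the sample threshold of 127 are assumptions made only for illustration.

```python
def binarize(gray, threshold):
    """Return a binary image: 255 (white) where the grayscale value is
    strictly greater than the threshold, otherwise 0 (black)."""
    return [[255 if pixel > threshold else 0 for pixel in row] for row in gray]


# Hypothetical 2 x 3 grayscale image with values in 0..255.
gray = [
    [250, 130, 20],
    [128, 127, 200],
]
binary = binarize(gray, threshold=127)
```

A pixel equal to the threshold is treated as black here, which follows the "greater than" wording of the rule.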
3.2. Eliminate Staff
We use the y-projection of the orthogonal projection method to project the music score to be identified onto a single axis, which forms a graph called the projection profile, as shown in Figure 8. From this profile the position of the staff can be clearly found and the staff eliminated. This method greatly reduces the blurring of symbols caused by the erosion and dilation operations of morphology used previously, as shown in Figure 9.
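As a rough illustration of the projection profile, the following Python sketch counts the black pixels in each row, treats rows that are almost entirely black as staff lines, and erases them. The `min_fill` ratio and the rule that keeps a staff pixel when ink lies directly above or below it are assumptions for this sketch; the paper does not state how pixels shared by a staff line and a symbol are handled.

```python
def row_projection(img):
    """Projection onto the y-axis: number of black pixels (1s) in each row."""
    return [sum(row) for row in img]


def find_staff_rows(img, min_fill=0.8):
    """Rows whose black-pixel count reaches min_fill of the image width."""
    width = len(img[0])
    return [y for y, count in enumerate(row_projection(img))
            if count >= min_fill * width]


def remove_staff(img, staff_rows):
    """Erase staff rows, keeping pixels that have ink in the nearest
    non-staff row above or below (assumed to belong to a symbol)."""
    staff = set(staff_rows)
    height = len(img)
    out = [row[:] for row in img]
    for y in staff_rows:
        up, down = y - 1, y + 1
        while up in staff:
            up -= 1
        while down in staff:
            down += 1
        for x in range(len(img[y])):
            above = up >= 0 and img[up][x] == 1
            below = down < height and img[down][x] == 1
            if not (above or below):
                out[y][x] = 0
    return out


# Toy image (1 = black): a vertical stroke crossing one staff line.
img = [
    [0, 0, 1, 0, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
staff_rows = find_staff_rows(img)
cleaned = remove_staff(img, staff_rows)
```

The recorded `staff_rows` are worth keeping after removal, since the later pitch lookup compares note-head positions against them.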
3.3. Symbol Distinction
For an introduction to rests, please refer to Wang and Jhang. Generally, the height of most rests is not greater than the height of the notes. Therefore, in this paper the symbols are first divided into rests and ordinary notes to facilitate subsequent identification. To distinguish between continuous (beamed) notes and discontinuous notes, the image is projected onto the x-axis using the x-projection of the orthogonal projection method. Note stems whose projection exceeds a certain value are eliminated, as shown in Figure 10. Then the pixel clustering method is used to distinguish the following three types: discontinuous notes, continuous notes, and discontinuous notes with tails, as shown in Figure 11.
The pixel clustering method uses the IMAQ Count Objects 2 VI component of the Vision development module in LabVIEW. This component clusters the binarized pixels, and the user sets a pixel threshold that determines which pixel groups are counted. Figure 12 shows the result of pixel cluster identification.
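The stem elimination step can be sketched in Python as follows. Columns whose black-pixel count reaches a chosen height are treated as stems and cleared. The function names and the `min_height` value are hypothetical; the paper only says that stems exceeding "a certain value" are removed.

```python
def column_projection(img):
    """Projection onto the x-axis: number of black pixels (1s) per column."""
    return [sum(column) for column in zip(*img)]


def remove_stems(img, min_height):
    """Clear every column whose black-pixel count is at least min_height.
    Returns the cleaned image and the sorted list of removed columns."""
    projection = column_projection(img)
    stem_cols = {x for x, count in enumerate(projection) if count >= min_height}
    cleaned = [[0 if x in stem_cols else pixel for x, pixel in enumerate(row)]
               for row in img]
    return cleaned, sorted(stem_cols)


# Toy note (1 = black): a stem in the last column above a small note head.
note = [
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
    [0, 1, 1, 1],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
head_only, stems = remove_stems(note, min_height=4)
```

In this toy case the note head loses the pixels that share a column with the stem, which is one reason the threshold would need tuning on real scores.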
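For readers without LabVIEW, the idea of grouping binarized pixels can be approximated by connected-component labeling. The sketch below is not the IMAQ Count Objects 2 VI and makes no claim about its internals; the 8-connectivity, the `min_pixels` threshold, and the returned fields are assumptions chosen for illustration.

```python
from collections import deque


def count_objects(img, min_pixels=1):
    """Group 8-connected black pixels (1s) and return one record per group
    that contains at least min_pixels pixels."""
    height, width = len(img), len(img[0])
    seen = [[False] * width for _ in range(height)]
    objects = []
    for y in range(height):
        for x in range(width):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first search over the pixels connected to (y, x).
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if len(pixels) >= min_pixels:
                ys = [p[0] for p in pixels]
                xs = [p[1] for p in pixels]
                objects.append({
                    "size": len(pixels),
                    "bbox": (min(ys), min(xs), max(ys), max(xs)),
                    "center": (sum(ys) / len(ys), sum(xs) / len(xs)),
                })
    return objects


# Toy image: one 2 x 2 blob and two isolated noise pixels.
img = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
]
all_groups = count_objects(img, min_pixels=1)
large_groups = count_objects(img, min_pixels=2)
```

Raising `min_pixels` filters out small specks, which mirrors the role of the user-set pixel threshold described above.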
3.4. Scale Recognition
While scale recognition is performed, the note duration is also recognized. Therefore, the head and tail of each note must be distinguished first. The discontinuous notes have already been separated out after the note stems were removed, so they need no further processing at this stage. Discontinuous notes with tails are distinguished by the proportion of black pixels in the image extracted by pixel clustering. The ratio is computed over the region enclosed by the red frame, from the size of that region and the number of black pixels in it, as shown in Figure 13. Continuous notes cannot be identified by this method because the proportions of black pixels in note stems and note heads are too close. Instead, we distinguish them by checking whether the aforementioned note stems account for more than two-thirds of the overall note width. The position of the beam changes depending on how the music is written, so we divide the continuous note at its center point into upper and lower parts, as shown in Figure 14.
Discontinuous notes differ from one another in the beam and in whether the head is hollow. We use the ratio of black pixels in the red box to distinguish half notes from quarter notes, as shown in Figure 15. Because the position of the staff was recorded when the staff was removed, we only need to extract the center point of the note head through pixel clustering and compare it with the recorded staff position to know which line or space the note head lies on. Looking this position up in the scale table of the recorder gives the scale of the note, as shown in Figures 16 and 17.
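The black-pixel proportion used here is simply the number of black pixels inside a bounding frame divided by the frame area. A minimal Python sketch follows; the frame format `(top, left, bottom, right)` with inclusive bounds and the two toy crops are assumptions, and no decision threshold is given because the paper does not state one.

```python
def black_ratio(img, bbox):
    """Fraction of black pixels (1s) inside an inclusive bounding frame
    given as (top, left, bottom, right)."""
    top, left, bottom, right = bbox
    area = (bottom - top + 1) * (right - left + 1)
    black = sum(img[y][x]
                for y in range(top, bottom + 1)
                for x in range(left, right + 1))
    return black / area


# Two hypothetical crops of the same size: a dense one and a sparse one.
dense = [
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]
sparse = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
frame = (0, 0, 2, 3)
dense_ratio = black_ratio(dense, frame)
sparse_ratio = black_ratio(sparse, frame)
```

Comparing such ratios against empirically chosen thresholds is what separates shapes with clearly different ink density; shapes with similar density need another cue, as the text notes for continuous notes.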
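The comparison between the note-head center and the recorded staff position can be sketched as follows. The sketch assumes a five-line staff in treble clef whose bottom line is E4, and it names pitches as written on the staff; the actual scale table of the recorder (Figures 16 and 17) is not reproduced here, so `step_to_pitch` is only a hypothetical stand-in for that lookup.

```python
NOTE_NAMES = ["C", "D", "E", "F", "G", "A", "B"]


def staff_step(center_y, line_ys):
    """Number of half-spacings between the note-head center and the bottom
    staff line. 0 = bottom line, 1 = first space, 2 = second line, and so on.
    Image y grows downward, so the bottom line has the largest y."""
    lines = sorted(line_ys)
    spacing = (lines[-1] - lines[0]) / (len(lines) - 1)
    return round((lines[-1] - center_y) / (spacing / 2))


def step_to_pitch(step):
    """Map a staff step to a written pitch name, assuming treble clef
    (step 0 = E4). This is an assumed table, not the paper's."""
    diatonic_index = step + 2 + 4 * 7  # C4 has index 28, E4 has index 30
    octave, degree = divmod(diatonic_index, 7)
    return f"{NOTE_NAMES[degree]}{octave}"


# Hypothetical staff line rows recorded during staff removal.
line_ys = [10, 20, 30, 40, 50]
pitches = [step_to_pitch(staff_step(y, line_ys)) for y in (50, 35, 26, 60)]
```

Rounding to the nearest half-spacing tolerates small errors in the detected center, as long as the error stays below a quarter of the line spacing.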
3.5. Rest Identification
Rests are identified in two stages, using the orthogonal projection method and pixel clustering. According to the appearance of commonly used rests, they are divided into three types: (1) whole rest and half rest, (2) quarter rest, (3) 8th rest and 16th rest. First, we use the pixel clustering method to capture the rest image and use the black pixel ratio in the red frame to make a preliminary judgment, as shown in Figure 18. Second, a detailed distinction is made within each of the three types. (1) The image coordinates of the y-axis projection of the whole rest and the half rest are captured and compared with the staff position. If the coordinates are closer to the fourth line of the staff, the symbol is judged to be a whole rest. If they are closer to the third line, it is a half rest. (2) The height value captured after y-axis projection is compared with the highest point value after x-axis projection. If the values are similar, the symbol is judged to be a quarter rest, as shown in Figure 19. (3) The 8th rest has only one peak in the y-axis projection image, whereas the 16th rest has two peaks, as shown in Figures 20 and 21.
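Rules (1) and (3) above can be sketched in Python. The sketch assumes that staff lines are numbered from the bottom, as in standard notation, that a peak is a contiguous run of profile values at or above a chosen `min_value`, and that the profile and coordinates shown are made-up sample data rather than measurements from the paper.

```python
def count_peaks(profile, min_value):
    """Count contiguous runs of profile values that reach min_value."""
    peaks = 0
    inside = False
    for value in profile:
        if value >= min_value and not inside:
            peaks += 1
        inside = value >= min_value
    return peaks


def classify_flagged_rest(profile, min_value):
    """Rule (3): one peak means an 8th rest, two peaks a 16th rest."""
    peaks = count_peaks(profile, min_value)
    if peaks == 1:
        return "8th rest"
    if peaks == 2:
        return "16th rest"
    return "unknown"


def classify_block_rest(rest_y, line_ys):
    """Rule (1): closer to the fourth line means whole rest, closer to the
    third line means half rest. Lines are counted from the bottom; image y
    grows downward, so the first line has the largest y."""
    lines = sorted(line_ys, reverse=True)
    third, fourth = lines[2], lines[3]
    if abs(rest_y - fourth) < abs(rest_y - third):
        return "whole rest"
    return "half rest"


line_ys = [10, 20, 30, 40, 50]
one_peak = [0, 1, 4, 5, 1, 0, 1, 1]
two_peaks = [0, 4, 5, 1, 0, 4, 5, 1]
results = [
    classify_flagged_rest(one_peak, min_value=3),
    classify_flagged_rest(two_peaks, min_value=3),
    classify_block_rest(22, line_ys),
    classify_block_rest(28, line_ys),
]
```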
4. FUZZY THEORY-BASED AIR VALVE CONTROL DESIGN
To provide smoother air pressure across the different sound ranges and to come closer to the way a human plays the flute, this article increases the number of solenoid valves to nine. The opening and closing gap of each solenoid valve is different. In this paper, a fuzzy theory-based control law is applied to write the valve control program. The range (R) is divided into small bass (SB), medium bass (MB), high bass (HB), small midrange (SM), medium midrange (MM), high midrange (HM), small treble (ST), medium treble (MT), and high treble (HT), as shown in Figure 22. Let U be the sound range and V be the solenoid air valves. The membership function of the range is of the Mamdani type, and the membership function of the output (solenoid air valves) is of the Takagi-Sugeno type. The fuzzy IF-THEN rules are expressed as follows.
If U is SB, then V is V1.
If U is MB, then V is V2.
If U is HB, then V is V3.
If U is SM, then V is V4.
If U is MM, then V is V5.
If U is HM, then V is V6.
If U is ST, then V is V7.
If U is MT, then V is V8.
If U is HT, then V is V9.
Here, the opening and closing gaps of solenoid air valves V1–V9 are expressed in number of turns. Their values are V1 = 1.1, V2 = 1.25, V3 = 1.5, V4 = 2, V5 = 2.25, V6 = 2.5, V7 = 2.75, V8 = 2.85, and V9 = 2.95, respectively.
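The nine rules and the valve values above can be sketched as a zero-order Takagi-Sugeno system in Python. Several things here are assumptions and are not taken from the paper: the pitch input `u` is normalized to the interval [0, 1], the nine range sets use evenly spaced triangular membership functions (the real shapes are in Figure 22), and two ways of combining the rules are shown because the paper does not state which is used.

```python
LABELS = ["SB", "MB", "HB", "SM", "MM", "HM", "ST", "MT", "HT"]
# Valve gaps V1..V9 in number of turns, as listed in the text.
VALVE_TURNS = [1.1, 1.25, 1.5, 2.0, 2.25, 2.5, 2.75, 2.85, 2.95]


def memberships(u):
    """Degree of membership of normalized pitch u in each range set.
    Assumed shape: triangles centered at i/8 with half-width 1/8."""
    u = min(max(u, 0.0), 1.0)
    width = 1.0 / (len(LABELS) - 1)
    return [max(0.0, 1.0 - abs(u - i * width) / width)
            for i in range(len(LABELS))]


def valve_opening(u):
    """Weighted average of the rule outputs (Sugeno-style defuzzification)."""
    mu = memberships(u)
    return sum(m * v for m, v in zip(mu, VALVE_TURNS)) / sum(mu)


def active_valve(u):
    """Alternative reading: open the valve of the strongest rule."""
    mu = memberships(u)
    return "V%d" % (mu.index(max(mu)) + 1)


lowest = valve_opening(0.0)       # only the SB rule fires
middle = valve_opening(0.5)       # only the MM rule fires
between = valve_opening(0.0625)   # halfway between SB and MB
```

With nine physical valves that each have a fixed gap, the strongest-rule selection in `active_valve` may be the closer match to the hardware; the weighted average is included to show what the Takagi-Sugeno output form yields between two neighboring ranges.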
5. EXPERIMENTAL RESULTS
Because musicians around the world notate tuplets (Figure 23a) and dotted notes (Figure 23b) in very different ways, these two symbols cannot all be identified effectively with a general rule. Therefore, they are ignored in the experiments of this article. The scores used for testing are taken from web sites made by netizens and provided free of charge [7,8]. They include five Mandarin pop songs, two Japanese pop songs, and three movie theme songs. Please refer to the following URL directly for the experimental results.
This paper uses the pixel clustering method together with the x- and y-axis projection methods to improve the score recognition results. Moreover, the finger-shaped electric arms are replaced with solenoid valves that drive pneumatic cylinders to press the blow hole smoothly. This change has also reduced air leakage. In addition, fuzzy control theory is used to control the nine solenoid air valves, which greatly reduces the sound distortion caused by the original single air valve. Experimental results show that fuzzy theory-based air valve control can fully realize the functions of auto-score-recognition soprano recorder machines.
CONFLICTS OF INTEREST
The author declares no conflicts of interest.
Dr. Chun-Chieh Wang
He received the PhD degree from the Graduate School of Engineering Science and Technology, National Yunlin University of Science & Technology, Yunlin, Taiwan, in 2004. He is currently a Professor in the Department of Automation Engineering and the Institute of Mechatronoptic Systems of Chienkuo Technology University. His research interests include robotics, image detection, electromechanical integration, innovative inventions, long-term care aids, and applications of control theory. He is a permanent member of the Chinese Automatic Control Society (CACS) and of the Taiwan Society of Robotics (TSR).
Cite this article
TY - JOUR
AU - Chun-Chieh Wang
PY - 2021
DA - 2021/12/27
TI - Fuzzy Theory-based Air Valve Control for Auto-Score-Recognition Soprano Recorder Machines
JO - Journal of Robotics, Networking and Artificial Life
SP - 278
EP - 283
VL - 8
IS - 4
SN - 2352-6386
UR - https://doi.org/10.2991/jrnal.k.211108.010
DO - 10.2991/jrnal.k.211108.010
ID - Wang2021
ER -