Open Access (CC BY 4.0 license). Published by De Gruyter, February 8, 2023

Visual inspection intelligent robot technology for large infusion industry

Qilang Liang and Bangshun Luo
From the journal Open Computer Science


The application of intelligent technology has transformed production and daily life, and it has also advanced the field of medicine, where the level of intelligence continues to increase. Combining existing methods and techniques with the mechanical field, this article proposes using visual inspection technology to realize the fusion of the medical and mechanical fields. This helps analyze and solve practical problems in current infusion production, such as low efficiency and the insufficient rigidity of large infusion plastic bottles. Drawing on the principles of deep learning algorithms and neural networks, technical research on intelligent visual inspection robots is carried out to realize intelligent infusion inspection. In the detection accuracy study, the detection rate of standard particles larger than 85 µm reached almost 100%, while the detection rate of 50 µm standard particles was lower and unstable. The results of the manual light inspection control group varied between inspectors, with detection rates between 50 and 80%, clearly worse than the inspection robot. Research on intelligent inspection robot technology is therefore of great importance.

1 Introduction

Drugs can be roughly classified into large infusions, oral liquids, syrups, water injections, powder injections, etc. Among them, large infusions alone account for an annual production capacity in China of about 10 billion bottles. At the same time, with the promotion of the Belt and Road Initiative, the market for large infusions keeps growing. However, the large infusion industry's shift from traditional glass bottles to plastic bottles raises two new problems: on the one hand, the light transmittance of plastic is relatively weak; on the other hand, the rigidity of plastic is worse than that of glass, which poses a severe challenge to the traditional light inspection machine. Moreover, while the mechanical structure of a large infusion light inspection machine can be replicated, its software algorithms are difficult to reproduce. As a result, this industry is monopolized by European and American companies, such as Brevetti of Italy, Seidenader of Germany, and GF of the United States. The proposed R&D and industrialization of the visual inspection robot for the large infusion industry mainly cover three aspects: research on the motion control method and image fusion technology of visual robots with multiple light sources, research on image calibration and classification technology based on defect description and quantitative definition, and research on the realization of an image detection and sorting control system based on transfer learning. We have conducted further research and discussion on the basis of existing technology, which has certain reference significance. The quality of large infusion products is improved, and labor costs and personnel instability factors are reduced.

The research, development, and industrialization of visual inspection robots for the large infusion industry are motivated by the weak light transmittance and poor rigidity of large infusion plastic bottles. The main research contents include three aspects. First, the motion control method and image fusion technology of visual robots with multiple light sources are studied. A 660 nm laser parallel light source is used for irradiation, which effectively reveals impurities in the sample, and a 5 µm, 90 kV X-ray source is used for perspective detection, which compensates for the poor light transmittance. Multimodal image acquisition is performed in infrared and visible-light transmission and reflection modes; a control system integrating these means collects the images separately and then fuses them. Second is research on image calibration and classification technology based on defect description and quantitative definition. Defects are described on the basis of quality-assessed images and then quantitatively defined, and labeling is completed automatically by software. The diversity of industrial varieties means that image annotation can only be completed automatically by software, which achieves a labeling efficiency of 0.5 s per image. Third is research on the realization of an image detection and sorting control system based on transfer learning. Image classification based on deep learning adopts transfer learning: how to retrain an already-trained model under the condition of limited samples is a key idea of transfer learning. This article uses visual inspection technology and algorithms to study the system framework and mechanics of a large infusion robot.

Research on visual inspection robots is currently a hot topic. Li et al. proposed a new cognitive voice activity detection method for industrial inspection robots; specifically, a constrained latent space is introduced to mimic human cognitive abilities, representing abstractions learned from observed normal and abnormal data [1]. Mansouri et al. addressed the collaborative coverage path planning problem using multiple unmanned aerial vehicles to inspect complex infrastructure from an offline 3D reconstruction [2]. Fischer et al. reviewed the medicinal chemistry literature to identify consistent principles for visual inspection, highlight examples of its successful application, and discuss its limitations [3]. The study by Mishra et al. evaluated the performance of cytology as a visual triage for women who screened positive by visual inspection with 4% acetic acid [4]. Balmer et al. described the incidence of edema in dairy cows in the context of infusion, as determined by visual inspection, and proposed a visual scoring system, evaluating its repeatability and reproducibility [5]. However, owing to many practical problems with equipment and data sources, the above research remains at the theoretical stage and has not been put into practice.

Using deep learning and neural network algorithms to conduct technical research on visual inspection robots is a novel idea, and many scholars are pursuing it. Silkensen et al. simplified and improved cervical cancer screening using low-cost molecular detection and improved visual detection; VIA-based screening continues to provide a low-cost, single-visit screening method [6]. Kaichi et al. noted that visual inspection is an important step in maintaining the quality of industrial parts, and that the relationship among three factors, the part's camera pose, the light direction, and the normal vector, is critical for detecting anomalies in industrial parts [7]. Qian and Luo investigated the effect of acute high-volume infusion on intraoperative hemodynamic fluctuations during induction of general anesthesia in patients undergoing robotic pancreatic surgery [8]. Yu et al. proposed a vision-based detection system for the relocalization problem and designed a dense-network-based neural network structure to regress a 4-degree-of-freedom robot pose [9]. Engström and Strimling proposed that artificial intelligence, in the form of visual inspection techniques, diffuses into the medical field in a unique way compared with previous general-purpose techniques [10]. However, the above research focuses only on the design of robots; image semantic segmentation and image fusion algorithms are not deeply explored.

The innovations of this article are as follows: (1) Dynamic detection based on laser illumination, static detection based on X-ray, and reflection detection under visible light are combined; multi-light-source control collects images from different angles, which are then fused and analyzed. (2) In view of the large variety of large infusion products, automatic calibration of multi-variety images after quantitative definition based on image description makes the calibration of industrial images feasible. (3) Building on the traditional infusion testing process, the combination of visual inspection technology with robotics has far-reaching implications for the medical field.

2 Overall Design of a Large Infusion Visual Inspection Robot

2.1 Recognition and Work of Visual Inspection Robot

According to the actual production situation of drugs, most foreign bodies settle as precipitates. To detect visible foreign objects, the inspection robot first spins the large infusion bottle at high speed via the bottle-wiping mechanism. The detection platform then rotates and brings the bottle to an abrupt stop, at which point the impurities in the bottle form a vortex rotating around the bottle's center [11]. Immediately afterward, the photographing signal of the industrial camera is triggered, and the camera continuously takes several pictures of the bottle body. Since the impurities move with the liquid, the images captured in this way are not affected by scratches or debris on the bottle surface. Finally, the images are transmitted from the industrial camera to the computer's image recognition processing software through the camera's integrated network port. The kick-bottle signal is transmitted to the motion control system through the input/output (IO) board, and the motion control system then separates out the defective large infusion bottle. A schematic diagram is shown in Figure 1.

Figure 1: Schematic diagram of the robot for detecting visible foreign bodies in large infusions.
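The frame-difference principle described above (impurities move with the liquid while bottle-surface scratches stay fixed between shots) can be sketched in a few lines. This is a minimal NumPy illustration, not the robot's actual recognition software; the frame data and threshold are invented for the example:

```python
import numpy as np

def moving_impurity_mask(frames, threshold=30):
    """Given consecutive grayscale frames of a spun-then-stopped
    bottle, flag pixels whose intensity changes between frames.
    Static marks (scratches, label debris) cancel out in the
    difference; impurities swirling with the liquid do not."""
    frames = np.asarray(frames, dtype=np.int32)
    # Absolute difference between successive frames.
    diffs = np.abs(np.diff(frames, axis=0))
    # A pixel is "moving" if it changed strongly in any frame pair.
    return (diffs > threshold).any(axis=0)

# Toy example: a 5x5 image with one static scratch and one impurity
# that moves one pixel between the two frames.
f1 = np.zeros((5, 5)); f1[0, 0] = 200; f1[2, 2] = 180  # scratch + impurity
f2 = np.zeros((5, 5)); f2[0, 0] = 200; f2[2, 3] = 180  # impurity moved
mask = moving_impurity_mask([f1, f2])
print(mask[2, 2], mask[2, 3], mask[0, 0])  # True True False
```

Only the impurity positions are flagged; the static scratch at (0, 0) is suppressed, which is exactly why the captured sequences are insensitive to surface defects.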

Its main workflow is as follows:

  1. When the photoelectric sensor detects a medicine bottle to be inspected on the conveyor belt, the large infusion bottle is fed at a fixed spacing to the detection platform through the input star-wheel structure and rotates with the detection platform in a fixed direction [12].

  2. When the detection platform rotates to the bottle cleaning station, the high-speed bottle cleaning motor rotates and rubs the large infusion medicine bottle to be inspected.

  3. When the large infusion medicine bottle rotates to the next station, the brake device will stop the bottle instantly. At this time, the programmable logic controller (PLC) sends a photographing instruction to the camera, and the camera starts to photograph the medicine bottle.

  4. The industrial camera transmits the continuously captured images to the industrial computer through a high-speed network card, where the image processing and analysis software analyzes them. It judges whether the inspected large infusion bottle contains foreign matter and transmits the detection result for this station to the PLC through the IO card.

2.2 Overall structure of the robot

A certain number of medicine bottles are delivered to the large infusion visible foreign body detection robot and enter the detection platform through the star-wheel mechanism. Each bottle spins at high speed and then stops abruptly after passing the bottle-wiping mechanism. The functions to be implemented by the system are shown in Figure 2.

Figure 2: The overall structure of the large-scale infusion vision inspection robot.

As can be seen from Figure 2, the large-scale infusion foreign body detection visual robot is divided into the following functional modules: mechanical transmission system, machine vision recognition system, electronic control system, computer image processing and analysis equipment, etc.

2.2.1 Mechanical transmission system

The main function of the mechanical transmission device is to realize bottle conveying, separation, rotation, emergency stopping, and rejection of large infusion bottles. It mainly consists of a sealed body, a main rotating wheel, a conveying track, a bottle-separating star-wheel mechanism, a bottle-wiping mechanism, and a rejecting mechanism. The body of the inspection robot is made of black-lacquered light-shielding glass and stainless steel to keep the entire image recognition process in a sealed environment, which greatly reduces the interference of external ambient light on system recognition [13]. The input star-wheel mechanism transfers the large infusion bottle to the detection platform, and the robot automatically clamps it. The detection platform also carries the bottle-cleaning and braking mechanisms. Behind the output star wheel is the rejection mechanism, which automatically removes large infusion bottles containing foreign objects according to the result of industrial personal computer (IPC) image recognition processing.

2.2.2 Electrical control system

The large infusion visible foreign body detection robot is high-tech equipment integrating computer, optical imaging, image processing, electromechanical control, and mechanical design technologies. Its mechanical, electrical control, and visual inspection parts must coordinate with each other to form a whole [14,15]. The electrical control system handles the motor motion control of the mechanical transmission system, the camera trigger signal, and the pneumatic valves. These control signals are handled by the PLC, while the IPC completes the image detection part. This not only reduces the computational burden on the PC but also improves the reliability and stability of the control system. Figure 3 is a block diagram of the electrical control system.

Figure 3: Structure diagram of the electrical control system.

As can be seen from Figure 3, the human–machine interface uses a touch screen connected to the industrial computer, and the industrial computer is connected to the communication port of the PLC through a serial port. The image acquisition and processing system transmits the processed results to the input end of the PLC as switching values through the IO board. The PLC performs logic operations to control the rejecting mechanism for defective products and ensure its normal operation [16]. To keep the detection platform running smoothly, a reasonable envelope motion curve must be designed. An inverter drives the conveyor belt motor, so the conveyor speed can be adjusted freely and smoothly. Light source control mainly ensures the service life of the light source and the imaging quality. The foreign object detection robot adopts a static detection method: the light source is turned on only after the detection platform is in place, which prolongs its service life. The brightness of the light source is also adjustable, so the light intensity can be tuned to the best imaging level with little interference, allowing foreign objects in the bottle to be photographed clearly.

2.2.3 Cam clamping manipulator

The volume of a large infusion bottle is relatively large, and it tends to swing during high-speed rotation. In severe cases, the inspected bottle can fly out and damage the mechanical equipment, and the swinging also affects subsequent image recognition. Therefore, a reasonable clamping device is designed to ensure the stability of the detection process, as shown in Figure 4.

Figure 4: Cam clamping manipulator: (a) the bottle-pressing state and (b) the released state.

As can be seen from Figure 4, the main function of the cam clamping manipulator is to lift before the large infusion bottle enters the detection station and, after the bottle enters, to drop and press it down so that the bottle body does not vibrate while being wiped. Figure 4(a) shows the bottle-pressing state of the cam clamping manipulator, and Figure 4(b) shows the released state. When the manipulator moves to the lower section of the slide rail, the pressure spring causes the pressure rod to drive the pressure head down and press the large infusion bottle. When the manipulator moves to the higher section of the slide rail, the pressure rod drives the pressure head up, releasing the bottle. This completes the pressing and releasing actions [17,18].

2.3 Image fusion and semantic segmentation algorithm analysis

2.3.1 Analysis of semantic image segmentation algorithm

The semantic image segmentation algorithm assigns semantic labels to the segmented image (different colors for different kinds of objects), labeling every object class in the image. The input is usually a color-depth image.

Due to the complementary information between different spectra, multispectral images improve the robustness and accuracy of semantic segmentation methods to a certain extent. To show intuitively that there is a relationship between category and spectrum, this article measures the information coupling between different spectra by the spectral correlation of two spectra in the k-th object region:

(1) $c_k = \dfrac{\sum_{i=1}^{N_k} (x_i - \bar{x}_k)(y_i - \bar{y}_k)}{\sqrt{\sum_{i=1}^{N_k} (x_i - \bar{x}_k)^2 \sum_{i=1}^{N_k} (y_i - \bar{y}_k)^2}},$

where $x$ and $y$ are spectral images of two bands, $\bar{x}_k$ and $\bar{y}_k$ are the means of the responses within the k-th category region, respectively, and $N_k$ is the number of pixels in the k-th class region [19]. The lower the score, the lower the spectral correlation between the two spectra, the lower the information coupling and redundancy, and the more complementary the information carried by the two bands.
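As an illustration, equation (1) is a Pearson-style correlation between two band images restricted to one region's pixels. A minimal NumPy sketch, with the region mask and band data invented for the example:

```python
import numpy as np

def spectral_correlation(x, y, region_mask):
    """Spectral correlation of equation (1) between two band images
    x and y, restricted to the pixels of one object region.
    Values near 0 mean low redundancy between the bands, i.e. more
    complementary information."""
    xs = x[region_mask].astype(float)
    ys = y[region_mask].astype(float)
    xc = xs - xs.mean()
    yc = ys - ys.mean()
    return (xc * yc).sum() / np.sqrt((xc**2).sum() * (yc**2).sum())

# Sanity check: a band compared with itself is perfectly correlated.
rng = np.random.default_rng(0)
band = rng.random((8, 8))
mask = np.zeros((8, 8), dtype=bool); mask[2:6, 2:6] = True
print(round(spectral_correlation(band, band, mask), 6))  # 1.0
```

Comparing a band against its negation gives -1, and two independent bands give a value near 0, matching the interpretation in the text.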

By improving the similarity between pixels in the same category, the model can reduce the variation within categories and then obtain continuous and accurate segmentation results. Therefore, the category feature expression, that is, the category–spectral relationship matrix, is obtained from the spectral features, as shown in Figure 5.

Figure 5: Schematic diagram of the category–spectral correlation module.

It can be seen from Figure 5 that, for each pixel, the label corresponding to the maximum probability among its category channels is selected as the predicted category $l_i$. The category features are therefore calculated as:

(2) $\tilde{X}^{P} = \delta(X^{h}), \quad \tilde{X}^{P} \in \mathbb{R}^{N \times H \times W},$

where N is the number of categories, and H and W are the height and width of the feature map $X^{h}$. The attention of the i-th pixel for class k is then

(3) $X^{P}_{k,i} = \dfrac{\tilde{X}^{P}_{k,i}}{\sum_{j=1}^{N} \tilde{X}^{P}_{j,i}}.$

By equation (3), the image I is divided into N attention maps. To better learn the relationship between categories and spectra, this article adopts supervised learning in the training phase and introduces a loss function $L_{\mathrm{obj}}$ to make $X^{P}$ close to the semantic segmentation labels [20]. The existing channel attention method, squeeze-and-excitation networks, adopts global average pooling; that is, the average of all pixel features of the s-th channel is taken as the feature $m_s$ of that channel:

(4) $m_s = \dfrac{1}{HW} \sum_{i=1}^{HW} X_{s,i}.$

Weighting instead by the class attention yields the soft class mean for the N categories:

(5) $M_{k,s} = \dfrac{\sum_{i=1}^{HW} X_{s,i} X^{P}_{i,k}}{\sum_{i=1}^{HW} X^{P}_{i,k}}.$

The above method extracts common features from the multispectral feature maps by the class-mean method to obtain the category–spectral relation matrix M, which serves as one of the inputs to the subsequent spectral channel enhancement module.
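A minimal sketch of the soft class mean of equations (4) and (5), assuming the per-pixel class attention of equation (3) has already been computed; the shapes and toy data below are illustrative only:

```python
import numpy as np

def soft_class_means(features, class_prob):
    """Category-spectral relation matrix M of equation (5).

    features:   (S, H*W) channel features X_{s,i}
    class_prob: (H*W, N) per-pixel class attention X^P_{i,k},
                e.g. the normalized maps of equation (3).
    Returns M of shape (N, S): for each class k, the mean of each
    channel s weighted by how strongly pixels belong to class k.
    """
    # Numerator: sum_i X_{s,i} * X^P_{i,k}  -> shape (S, N)
    num = features @ class_prob
    # Denominator: sum_i X^P_{i,k}          -> shape (N,)
    den = class_prob.sum(axis=0)
    return (num / den).T  # shape (N, S)

# Sanity check: with hard one-hot class maps this reduces to the
# ordinary per-class channel mean.
feats = np.array([[1.0, 3.0, 5.0, 7.0]])  # one channel, four pixels
probs = np.array([[1, 0], [1, 0], [0, 1], [0, 1]], dtype=float)
M = soft_class_means(feats, probs)  # class 0 mean 2.0, class 1 mean 6.0
```

With soft (non-binary) attention, each pixel contributes to every class in proportion to its membership, which is the "soft" part of the class mean.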

2.3.2 Analysis of image fusion effect based on X-ray

This article uses static X-ray image acquisition and dynamic laser acquisition, combined with an infrared mode, for image fusion and recognition. Static X-ray imaging addresses the poor light transmittance of large infusion plastic bottles; dynamic laser image acquisition addresses slow rotation speed and low efficiency; and infrared acquisition addresses bubble interference. Image analysis and fusion technology maximizes impurity detection while preserving production efficiency [21,22]. At present there are two main types of fusion rules, pixel-based and region-based. Taking the fusion of two images as an example, several commonly used fusion rules are introduced below, where A and B denote the original images and F the fusion result.

A weighted average of images A and B is:

(6) $F(i,j) = w_A A(i,j) + w_B B(i,j),$

where

(7) $w_A + w_B = 1.$

When $w_A = w_B = 0.5$, this reduces to the average method, namely:

(8) $F(i,j) = \frac{1}{2} (A(i,j) + B(i,j)).$

Alternatively, images A and B are compared pixel by pixel, and the larger value at each position is taken as the fused output there:

(9) $F(i,j) = \max(A(i,j), B(i,j)).$

Conversely, if the smaller pixel value is taken as the fused output at each position, the fused image F is

(10) $F(i,j) = \min(A(i,j), B(i,j)).$
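These pixel-level fusion rules, equations (6)-(10), can be expressed compactly in NumPy; the toy images below are invented for the example:

```python
import numpy as np

def fuse(A, B, rule="average", wA=0.5):
    """Pixel-level fusion rules from equations (6)-(10).
    'weighted': F = wA*A + (1-wA)*B; 'average' is the wA = 0.5 case;
    'max'/'min' pick the larger/smaller pixel at each position."""
    A = A.astype(float); B = B.astype(float)
    if rule == "weighted":
        return wA * A + (1.0 - wA) * B
    if rule == "average":
        return 0.5 * (A + B)
    if rule == "max":
        return np.maximum(A, B)
    if rule == "min":
        return np.minimum(A, B)
    raise ValueError(rule)

A = np.array([[10, 200], [30, 40]])
B = np.array([[90, 100], [50, 20]])
avg = fuse(A, B, "average")  # 50, 150, 40, 30
big = fuse(A, B, "max")      # 90, 200, 50, 40
```

The max rule tends to preserve bright impurity highlights from either modality, while the average rule suppresses uncorrelated noise; which rule suits which source pair is a design choice of the fusion system.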

Research on different image analysis and fusion algorithms over such multi-angle data requires substantial computing power, particularly graphics processing unit (GPU) capacity. High-performance computing platforms handle such problems more efficiently thanks to their high-performance computing nodes and high-speed private networks.

In addition, high-performance computing has good software libraries for GPU acceleration that can be used to accelerate big data algorithms [23,24]. Therefore, the computing platform of this article intends to adopt a high-performance computing platform. The system architecture of the platform is shown in Figure 6.

Figure 6: System architecture diagram of image fusion.

2.3.3 Adaptive weighted median image labeling algorithm

This algorithm differs from the traditional algorithm in that the size of the image annotation window is determined automatically from the number of noise pixels in the window. By moving a 3 × 3 window over the image, the noise pixels in the current window can be identified. Assuming the gray value of the center pixel (i, j) is f(i, j), the gray values of all pixels in the window form the set:

(11) $S_{i,j} = \{ f(i+k, j+r) \mid k, r = -1, 0, 1 \}.$

The average of all pixels within the window is:

(12) $\mathrm{Average}(S_{i,j}) = \frac{1}{9} \sum_{k=-1}^{1} \sum_{r=-1}^{1} f(i+k, j+r).$

Noise sensitivity, defined based on human visual characteristics, can then be written as:

(13) $d_{i,j} = \frac{1}{3} \sqrt{ \sum_{k=-1}^{1} \sum_{r=-1}^{1} \left( f(i+k, j+r) - \mathrm{Average}(S_{i,j}) \right)^{2} }.$

After noise pixels are marked, their number determines the size of the image annotation window adaptively:

(14) $\mathrm{Num}(S_{i,j}) = \sum_{k=-1}^{1} \sum_{r=-1}^{1} N(i+k, j+r),$

where $N(i+k, j+r)$ is 1 if the pixel is marked as noise and 0 otherwise.

According to this count, the size of the image annotation window is selected automatically by the following rules:

(15) $L_{i,j} = 3 \times 3, \quad \mathrm{Num}(S_{i,j}) \in \{1, 2, 3\},$

(16) $L_{i,j} = 5 \times 5, \quad \mathrm{Num}(S_{i,j}) \in \{4, 5, 6\},$

(17) $L_{i,j} = 7 \times 7, \quad \mathrm{Num}(S_{i,j}) \in \{7, 8, 9\}.$

Then the similarity of each pixel (i + k, j + r) in the image annotation window is calculated as:

(18) $\mathrm{simila}(i+k, j+r) = \frac{1}{1 + (f(i+k, j+r) - f(i,j))^{2}}.$
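The window-size rule of equations (14)-(17) and the similarity weight of equation (18) can be sketched as follows; the function names and the toy noise map are illustrative, not taken from the original software:

```python
import numpy as np

def adaptive_window_size(noise_map, i, j):
    """Window-size rule of equations (14)-(17): count the noise
    flags N(i+k, j+r) in the 3x3 neighbourhood of (i, j) and map
    the count to a 3x3, 5x5, or 7x7 annotation window."""
    num = int(noise_map[i-1:i+2, j-1:j+2].sum())  # equation (14)
    if num <= 3:
        return 3
    if num <= 6:
        return 5
    return 7

def similarity(f, i, j, k, r):
    """Pixel similarity of equation (18)."""
    d = float(f[i + k, j + r]) - float(f[i, j])
    return 1.0 / (1.0 + d * d)

# Toy noise map with 5 flagged pixels around the centre -> 5x5 window.
noise = np.zeros((5, 5), dtype=int)
noise[1:4, 1:4] = [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
win = adaptive_window_size(noise, 2, 2)  # 5
f = np.array([[10, 10], [10, 12]])
s = similarity(f, 0, 0, 1, 1)  # 1 / (1 + 2^2) = 0.2
```

Denser noise widens the filtering window, while the similarity weight down-weights pixels that differ strongly from the centre, which together give the adaptive weighted median its noise robustness.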

As the algorithm flow above shows, several subdirectories are established when the image database is created, and a recursive algorithm is then used to search continuously for different images [25]. Given the type of the originally loaded image, images are loaded cyclically through recursive matching of a tree-structured model. The image loading conditions and type interface are set, and the image file type is entered, so that the target images are loaded continuously.

3 Algorithm and system of visual inspection robot

3.1 Image semantic segmentation algorithm

This article uses the image semantic segmentation technique of visual inspection technology to systematically test the performance of the large infusion robot. By making statistical decisions on the number of blobs in the region of interest, positive and defective products can be preliminarily distinguished. To this end, this article compares the proposed time-spectrum-based back propagation neural network (BPNN) segmentation algorithm with manual inspection, the difference algorithm, the quadratic difference algorithm, and the difference pulse-coupled neural network algorithm. The test objects are 1,000 genuine products, 500 glass foreign body samples, 500 hair foreign bodies, 500 color spot foreign bodies, 300 aluminum scrap foreign bodies, and 200 other uncertain foreign bodies. The test results are shown in Table 1.

Table 1

Comparison of test results from different methods

Measurement method       Result            Genuine   Glass   Hair   Color point   Other
Artificial inspection    Correct           962       482     461    423           177
                         False detection   35        5       25     65            5
Differential algorithm   Correct           940       474     464    460           165
                         False detection   21        22      33     38            7
Proposed algorithm       Correct           960       481     471    441           183
                         False detection   15        14      26     18            5

It can be seen from Table 1 that all the above methods can detect most genuine products and defective foreign objects. Manual visual inspection has low detection rates for fine color spots and hair, even though foreign bodies larger than 50 µm in diameter count as visible insoluble foreign bodies. The vision-based judgment of good products is superior to manual inspection: the manual process requires turning the bottle over, and bubble interference is severe. The comparison shows that the detection rate of the proposed algorithm for both genuine and defective products is better than that of the other algorithms, and its results are more stable than manual inspection [26]. Table 2 gives the processing times of the different algorithms.

Table 2

Comparison of the time consumption of different detection methods

Method                       10 frames   20 frames
Artificial inspection (ms)   5,000       5,000
Direct difference (ms)       104.38      184.91
Secondary difference (ms)    244.72      462.54
Proposed algorithm (ms)      305.17      686.29

It can be seen from Table 2 that the processing time of every algorithm grows markedly with the number of sequence frames. To compress the processing time, the BPNN in this algorithm reduces the dimension of the input vector by dropping uninformative intermediate frames, balancing processing time against information effectiveness.
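The frame-reduction idea, shrinking the BPNN input vector by dropping intermediate frames, can be sketched as below. The article does not give the exact selection rule, so keeping the first, last, and evenly spaced frames is an assumed illustration:

```python
def select_key_frames(frames, keep=6):
    """Reduce the BPNN input dimension by dropping intermediate
    frames: keep the first and last frames (where impurity
    displacement is largest) plus evenly spaced ones in between.
    'keep' is an illustrative parameter, not from the article."""
    n = len(frames)
    if n <= keep:
        return list(frames)
    idx = [round(i * (n - 1) / (keep - 1)) for i in range(keep)]
    return [frames[i] for i in idx]

chosen = select_key_frames(list(range(20)), keep=6)
# [0, 4, 8, 11, 15, 19]
```

Cutting a 20-frame sequence to 6 frames shrinks the input vector by 70% while preserving the endpoints of the impurity trajectory.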

3.2 Experimental results

The “Chinese Pharmacopoeia,” Part II, Annex IXH clearly stipulates the standard for “visible insoluble foreign matter” in pharmaceuticals: the detection particle size for insoluble foreign matter is 50 µm; that is, the national testing standard for insoluble foreign matter in injections is 50 µm. However, in manual inspection, because of limits on human eyesight, distraction, subjective factors, and inspection speed requirements, defects above 50 µm cannot be detected 100% of the time, and there are certain missed-detection and false-detection rates. Since the principle of manual detection differs from that of equipment detection, the detection results may differ, and the corporate standards of pharmaceutical companies for foreign body detection are mostly manual standards. Therefore, the equipment in this article was compared against manual inspection according to the Knapp–Kushner test method recognized by the European Pharmacopoeia and the U.S. Food and Drug Administration. Its core idea is to eliminate differences between decision-makers through statistical methods and obtain the most confident result. From the perspective of statistical probability, this method evaluates each detection target in a group repeatedly, in batches and across multiple dimensions, and collects all the results; finally, the true level of the target is given according to the probability distribution. The Knapp–Kushner test process is as follows:

  1. Sample preparation: Products on the production line are manually inspected, and a large number of defective products are sorted out. The defective products are classified by detection difficulty and defect type, such as glass shavings, hair, fibers, and color spots, and a large number of candidate samples are obtained through collection over several shifts.

  2. Knapp test sample screening: 100 defective samples were selected from the collected candidates. The samples should cover different defect types, particle sizes, and detection difficulties so as to reflect the real situation of the product as far as possible. In addition, 300 uninspected samples taken from the production line were randomly mixed with the 100 defective samples and numbered.

Through the testing of these 400 samples, the quality factor data obtained by manual testing are shown in Table 3, and those obtained by machine testing are shown in Table 4.

Table 3

Manual inspection quality factor table

Bottle number   Score   Bottle number   Score
1               3       321             9
2               8       322             8
3               9       323             1
4               1       324             1
5               0       325             5
Table 4

Robot detection quality factor table

Bottle number   Score   Bottle number   Score
1               8       321             10
2               10      322             6
3               10      323             6
4               0       324             7
5               2       325             0

3.3 Detection performance

3.3.1 Equipment detection accuracy test

Take S > 7 as the dividing line between genuine and defective products in manual inspection, let FQA_i denote the manual inspection quality factor of the i-th sample, and let FQB_i denote the machine inspection quality factor of the i-th sample. Summing the quality factors of all manually detected samples with S > 7, and likewise the machine quality factors FQB over the same S > 7 samples, the efficiency ratio is calculated as:

(19) $\eta = \dfrac{\sum \mathrm{FQB}_{S>7}}{\sum \mathrm{FQA}_{S>7}} \times 100\%.$
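Equation (19) can be computed directly from the two quality-factor tables; a sketch with invented scores, using the S > 7 threshold from the text:

```python
def knapp_efficiency(fqa, fqb, threshold=7):
    """Knapp-Kushner efficiency ratio of equation (19): the machine's
    summed quality factors (FQB) over the manual ones (FQA),
    restricted to samples the manual inspection scored above the
    threshold (S > 7)."""
    idx = [i for i, s in enumerate(fqa) if s > threshold]
    num = sum(fqb[i] for i in idx)
    den = sum(fqa[i] for i in idx)
    return 100.0 * num / den

# Toy data: manual (FQA) and machine (FQB) quality factors per bottle.
fqa = [3, 8, 9, 1, 10]   # samples with scores 8, 9, 10 exceed S > 7
fqb = [8, 10, 10, 0, 9]
eta = knapp_efficiency(fqa, fqb)  # 100 * (10+10+9) / (8+9+10)
```

A ratio of at least 100% means the machine scores the manually confirmed defective set no worse than the manual inspectors do, which is the acceptance criterion typically applied in Knapp testing.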

The detection accuracy of the equipment is verified with standard particles of known diameter: standard particles of 50, 85, 100, and 500 µm are filled into injection bottles, 1,000 bottles of each particle size. Of these, 50 bottles of each size were passed through the injection visual inspection equipment, and the test results are shown in Figure 7.

Figure 7: The detection rate of standard particles by the visual inspection robot.

It can be seen from Figure 7 that the detection rate of standard particles of 85 µm and above is almost 100%, while the detection rate of 50 µm standard particles is low and unstable. The same 50 µm samples were also submitted to manual light inspection; the results differed between inspectors, with detection rates between 50 and 80%, significantly worse than the inspection robot.

3.3.2 Detection performance test of various defects on the production line

Manual light inspection by skilled inspection workers was used to accumulate 10,000 unqualified products. At the same time, 10,000 qualified products were prepared, so that the missed detection rate and false detection rate of the equipment could be tested in the actual production environment. The experimental data are shown in Figure 8.

Figure 8: Statistics of the missed detection rate and false detection rate of the visual inspection robot.

It can be seen from Figure 8 that the probability of a genuine product being detected as genuine exceeds 98%, and the missed detection rate for defective products is about 1.5%. The missed products were manually re-inspected, and it was found that their defects were of very small size and faint appearance; manual inspection also misses them easily.
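The rates above follow directly from raw confusion counts. The sketch below is illustrative only: the function name and the count breakdown are assumptions chosen to be consistent with the reported ~98% pass rate and ~1.5% miss rate, not the paper's actual data.

```python
def detection_stats(tp, fn, fp, tn):
    """Compute the rates reported in Figure 8 from raw counts:
    tp = defective detected as defective, fn = defective missed,
    fp = genuine rejected as defective, tn = genuine passed."""
    missed_rate = fn / (tp + fn) * 100.0   # missed detection rate
    false_rate = fp / (fp + tn) * 100.0    # false detection rate
    genuine_pass = tn / (fp + tn) * 100.0  # genuine accepted as genuine
    return missed_rate, false_rate, genuine_pass

# Illustrative counts over the 10,000 defective and 10,000 qualified
# test products (hypothetical breakdown, not the measured data):
missed, false_det, genuine = detection_stats(tp=9850, fn=150, fp=180, tn=9820)
print(round(missed, 1), round(false_det, 1), round(genuine, 1))  # → 1.5 1.8 98.2
```

Note that the missed rate is normalized over defective products while the false rate and pass rate are normalized over genuine products, so the two error rates are not complementary.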

3.3.3 Consistency test for long-term use of the machine

Inspection equipment is itself a measuring instrument: its repeatability, accuracy, detection range, and other indicators must remain strictly consistent, unaffected by power cycling, operators, incoming materials, or detection time. To this end, a Gauge Repeatability and Reproducibility (GRR) stability test was carried out on the detection equipment to verify its long-term repeatability and reproducibility. First, 1,000 samples containing the most common insoluble foreign bodies and 1,000 genuine samples from the production process were prepared as the long-term test set. Second, the sample test was carried out over 10 days, with the machine powered off and restarted before each session to account for the influence of switching on and off. One test session was run per day, and in each session the 2,000 samples were tested nine times to form a test report; the statistical reports are then examined to determine whether the device detection rate follows a normal distribution. Finally, batch verification and data statistics were carried out according to the requirements of the GRR test, producing the statistics shown in Figure 9.
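The daily protocol (10 days, nine repeated runs per day over the 2,000-sample set) can be summarized per day before the normality check. A minimal sketch, with the function name and all rate values hypothetical:

```python
import statistics

def daily_detection_rates(results):
    """results: a list of 10 days, each day a list of 9 repeated runs,
    each run a detection rate in percent over the 2,000-sample set.
    Returns per-day mean and run-to-run spread for the GRR report."""
    report = []
    for day, runs in enumerate(results, start=1):
        report.append({
            "day": day,
            "mean": statistics.mean(runs),
            "stdev": statistics.stdev(runs),  # repeatability within a day
        })
    return report

# Synthetic example: nine runs per day hovering above 98%, as in Figure 9
runs = [98.4, 98.6, 98.5, 98.7, 98.5, 98.6, 98.4, 98.5, 98.6]
report = daily_detection_rates([runs] * 10)
print(round(report[0]["mean"], 2))
```

The day-to-day spread of the means reflects reproducibility (effects of restarts and operators), while the within-day standard deviation reflects pure repeatability; both must stay small for the GRR criterion to pass.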

Figure 9: Vision inspection robot GRR test.

Figure 9 shows the detection data on days 1 and 10. It can be clearly seen from the statistics that the detection rate for both genuine and defective products has remained above 98%. The repeatability and stability of machine detection are minimally affected by external factors, meeting the long-term stability requirements for instruments and equipment. Through this series of tests on the visual inspection robot (the Knapp–Kushner test, the detection accuracy test, the defect detection performance tests, and the GRR stability test), it is verified that the developed pharmaceutical foreign-body visual inspection robot meets the production needs of pharmaceutical companies and can fully replace manual inspection. Enterprises using the equipment have rated it highly: it not only reduces testing costs but also raises testing standards. A comparison of vision-based infusion inspection robots with traditional infusion robots shows that the vision-based robots are more efficient and effective.

4 Conclusion

This article surveys the level of existing detection technology and points out the necessity, importance, and market value of developing a fully automatic medical foreign-body visual inspection robot. It focuses on the technical problems faced in developing such detection robots: the variety of foreign objects, complex optical imaging, and high detection-efficiency requirements. To address the low light transmittance of plastic in large infusions, static imaging and X-ray penetration are used to acquire images of impurities and foreign objects. To address the insufficient rigidity of large infusion plastic bottles, flexible robot control is used to complete the handling task. For product changeovers that are difficult to configure manually, a deep learning method is adopted: after different product types are trained and stored, automatic image recognition completes the automatic type change. To address low efficiency, dynamic laser detection is combined with static X-ray detection to improve both detection performance and detection efficiency. The resulting robots replace human inspectors, improve efficiency significantly, and place the system at the forefront of intelligent detection in the large infusion industry. The completed research meets the requirements of detection robots used by pharmaceutical companies. However, some technical and scientific work still requires long-term research, in particular a visual inspection robot with an on-the-fly (flight) image acquisition scheme. The existing detection robot's main turntable and tracking turntable collect pictures synchronously, which places extremely high demands on the equipment.
In theory, by establishing a motion model of the active turntable, the shooting angle of the camera and the position change of the glass bottle in the image can be accurately calculated. On this basis, the sequence images can be restored in three-dimensional space, registered, and inspected. A detection scheme based on flight photography can greatly reduce mechanical jitter and improve detection speed. Although parts of this article are limited by insufficient data sources and immature data acquisition techniques, it still advances research on large infusion robots based on visual inspection technology.

  1. Funding information: This work was supported by Hainan Provincial Natural Science Foundation of China (No. 620RC670).

  2. Conflict of interest: There are no potential competing interests in our article. All authors have seen the manuscript and approved to submit to your journal. We confirm that the content of the manuscript has not been published or submitted for publication elsewhere.

  3. Data availability statement: Data sharing is not applicable to this article as no datasets were generated or analyzed during this study.



Received: 2022-08-01
Revised: 2022-10-03
Accepted: 2022-11-02
Published Online: 2023-02-08

© 2023 the author(s), published by De Gruyter

This work is licensed under the Creative Commons Attribution 4.0 International License.
