Showing 41 - 50 of 1,619,580 results for '"Computer vision"'. Query time: 1.34s
  41.
    Academic Journal

    Authors: Djalab, Abdelhak1, Lalaoui, Lahouaoui1 lahouaoui.lalaoui@univ-msila.dz, Bisker, Aya1, Hadibi, Aicha1

    Source: Traitement du Signal. Feb2024, Vol. 41 Issue 1, p383-390. 8p.

    Abstract: In the field of computer vision, image classification stands as a pivotal task, aiming to categorize images based on their inherent visual information. This paper presents an innovative hybrid approach, merging the strengths of Convolutional Neural Networks (CNNs) and Hidden Markov Models (HMMs) to enhance the efficacy of image classification. The integration of these two methodologies, each excelling in distinct aspects of data analysis, forms the cornerstone of our research. CNNs, renowned for their proficiency in extracting spatial data and fine-grained features, are adept at generalizing across diverse datasets. Conversely, HMMs, with their robust sequential data modeling capabilities, adeptly capture dependencies within the feature sets derived from CNNs. This synergy is embodied in the HMM-CNN framework, wherein CNNs serve to extract pertinent features from images, while HMMs model the spatial dependencies between adjacent pixels. Empirical evaluations on benchmark datasets substantiate the superior performance of this hybrid approach over traditional CNNs, particularly in scenarios where temporal dependencies are paramount, such as video analysis, action recognition, and gesture classification. A comparative analysis employing five datasets and six metrics (recall, precision, val_loss, val_accuracy, val_precision, and val_recall) reveals the superiority of the CNN-HMM model. Specifically, against a standalone CNN model with an accuracy of 87%, the CNN-HMM model demonstrates an accuracy of approximately 89.09%. This paper's findings underscore the efficacy of combining CNN and HMM methodologies for advanced image classification tasks, offering significant implications for future research in this domain. [ABSTRACT FROM AUTHOR]
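
    The abstract describes the division of labor (CNN for feature extraction, HMM for dependency modeling) but no implementation. The following minimal sketch shows one way such a pipeline can be assembled, assuming PyTorch and hmmlearn are available; the patch size, feature dimension, state count, and class names are illustrative assumptions, not the authors' configuration.

```python
# Minimal sketch of a CNN-feature + HMM classification pipeline (not the
# authors' code). A small CNN embeds image patches; one GaussianHMM per class
# scores the resulting patch sequence; prediction is the argmax likelihood.
import numpy as np
import torch
import torch.nn as nn
from hmmlearn import hmm

class PatchCNN(nn.Module):
    """Embed each image patch into a small feature vector."""
    def __init__(self, feat_dim=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(8, feat_dim),
        )

    def forward(self, x):
        return self.net(x)

def image_to_sequence(img, cnn, patch=8):
    # Slice the image into a left-to-right, top-to-bottom patch sequence so
    # the HMM can model dependencies between adjacent regions.
    _, h, w = img.shape
    patches = [img[:, i:i + patch, j:j + patch]
               for i in range(0, h, patch) for j in range(0, w, patch)]
    with torch.no_grad():
        return cnn(torch.stack(patches)).numpy()

cnn = PatchCNN()
# Toy "dataset": random images standing in for two hypothetical classes.
seqs = {c: [image_to_sequence(torch.randn(3, 32, 32), cnn) for _ in range(5)]
        for c in ("class_a", "class_b")}
models = {}
for c, s in seqs.items():
    m = hmm.GaussianHMM(n_components=3, covariance_type="diag", n_iter=20)
    m.fit(np.concatenate(s), lengths=[len(x) for x in s])
    models[c] = m
test = image_to_sequence(torch.randn(3, 32, 32), cnn)
print(max(models, key=lambda c: models[c].score(test)))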

  42.
    Academic Journal

    Authors: Ornek, Ahmet Haydar1,2 ahmet.haydar.ornek2@huawei.com, Ceylan, Murat2

    Source: Traitement du Signal. Feb2024, Vol. 41 Issue 1, p63-71. 9p.

    Abstract: Deep learning models are proficient at predicting target classes, but they also need to explain their predictions. Explainable Artificial Intelligence (XAI) offers a promising solution by providing both transparency and object detection capabilities to classification models. Mask detection plays a crucial role in ensuring the safety and well-being of individuals by preventing the spread of infectious diseases. A new visual XAI method called HayCAM+ is proposed to address the limitations of the previous method, HayCAM, such as the need to select the number of filters as a hyper-parameter and the use of fully-connected layers. When object detection is performed using activation maps created via various methods, including GradCAM, EigenCAM, GradCAM++, LayerCAM, HayCAM, and HayCAM+, it is found that HayCAM+ provides the best results with an IoU score of 0.3740 (GradCAM: 0.1922, GradCAM++: 0.2472, EigenCAM: 0.3386, LayerCAM: 0.2476, HayCAM: 0.3487) and a Dice score of 0.5376 (GradCAM: 0.3153, GradCAM++: 0.3923, EigenCAM: 0.5003, LayerCAM: 0.3928, HayCAM: 0.5098). By using dynamic dimension reduction to eliminate unrelated filters in the last convolutional layer, HayCAM+ generates more focused activation maps. The results demonstrate that HayCAM+ is an advanced activation map method for explaining decisions and detecting objects using deep classification models. [ABSTRACT FROM AUTHOR]
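
    The comparison above ranks activation-map methods by IoU and Dice against ground-truth object regions. As a hedged illustration of how those two metrics are computed from a thresholded activation map (generic evaluation code, not HayCAM+; the threshold and toy data are assumptions):

```python
import numpy as np

def binarize_cam(cam, thresh=0.5):
    """Min-max normalize an activation map and threshold it to a binary mask."""
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam >= thresh

def iou(pred, gt):
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return inter / union if union else 0.0

def dice(pred, gt):
    inter = np.logical_and(pred, gt).sum()
    denom = pred.sum() + gt.sum()
    return 2 * inter / denom if denom else 0.0

# Toy example: a synthetic activation map peaked on the true object region.
gt = np.zeros((64, 64), bool)
gt[20:40, 20:44] = True
yy, xx = np.mgrid[:64, :64]
cam = np.exp(-(((yy - 30) / 12.0) ** 2 + ((xx - 32) / 14.0) ** 2))
pred = binarize_cam(cam)
print(f"IoU={iou(pred, gt):.4f}  Dice={dice(pred, gt):.4f}")
```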

  43.
    Academic Journal

    Authors: Wu, Yixun1, Wang, Taiyu1, Gu, Runze1, Liu, Chao1 lctj@tongji.edu.cn, Xu, Boqiang1

    Source: Journal of Intelligent & Fuzzy Systems. 2024, Vol. 46 Issue 2, p5377-5389. 13p.

    Abstract: To address the decreased accuracy of vehicle object detection models under low-light nighttime conditions, this paper proposes a method that enhances detection accuracy and precision by improving the training set with image translation technology based on Generative Adversarial Networks (GANs), specifically CycleGAN. An existing, well-established daytime vehicle dataset is transformed into a nighttime vehicle dataset. The proposed method takes a comparative experimental approach, obtaining translation models with different degrees of fitting by varying the training set size, and selects the optimal model based on an evaluation of the translation quality. The translated dataset is then used to train a YOLOv5-based object detection model, and the quality of the nighttime dataset is evaluated through annotation confidence and effectiveness. The results indicate that training the object detection model on the translated nighttime vehicle dataset increases the area under the PR curve and the peak F1 score by 10.4% and 9%, respectively. This approach improves the annotation accuracy and precision of vehicle object detection models in nighttime environments without requiring additional labeling of vehicles in monitoring videos. [ABSTRACT FROM AUTHOR]
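
    The translation step rests on CycleGAN's standard objective: an adversarial term plus a cycle-consistency term. Below is a minimal sketch of that generic objective, assuming PyTorch; the 1x1-conv stand-in networks and the weight lam=10.0 are illustrative, not the paper's training setup.

```python
# Hedged sketch of the generic CycleGAN objective used for day-to-night
# translation (not the authors' exact setup). G_dn: day->night, G_nd:
# night->day; D_n and D_d judge realism in the night and day domains.
import torch
import torch.nn as nn

def cyclegan_losses(G_dn, G_nd, D_n, D_d, day, night, lam=10.0):
    mse, l1 = nn.MSELoss(), nn.L1Loss()  # LSGAN adversarial term + L1 cycle term
    fake_night, fake_day = G_dn(day), G_nd(night)
    pn, pd = D_n(fake_night), D_d(fake_day)
    # Generators try to make the discriminators output "real" (1) on fakes.
    adv = mse(pn, torch.ones_like(pn)) + mse(pd, torch.ones_like(pd))
    # Cycle consistency: translating there and back should recover the input.
    cyc = l1(G_nd(fake_night), day) + l1(G_dn(fake_day), night)
    return adv + lam * cyc

# Stand-in 1x1-conv "networks" just to show the call pattern.
G_dn, G_nd = nn.Conv2d(3, 3, 1), nn.Conv2d(3, 3, 1)
D = nn.Conv2d(3, 1, 1)
day, night = torch.randn(2, 3, 64, 64), torch.randn(2, 3, 64, 64)
print(cyclegan_losses(G_dn, G_nd, D, D, day, night).item())
```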

  44.
    Academic Journal

    Authors: Zhigang, Liu1 37181093@qq.com, Baoshan, Sun2 sunbaoshan@tiangong.edu.cn, Kaiyu, Bi2 bikaiyu01@163.com

    Source: International Journal of Computational Intelligence & Applications. Mar2024, Vol. 23 Issue 1, p1-22. 22p.

    Abstract: With the rapid development of deep learning, object detection algorithms have made significant breakthroughs in computer vision. However, due to the complexity and computational requirements of deep Convolutional Neural Networks (CNNs), these models face many challenges in practical applications, especially on resource-constrained edge devices. Researchers have therefore proposed many lightweight methods that aim to reduce model size and computational complexity while maintaining high performance, and the popularity of mobile devices and embedded systems has further increased the demand for lightweight models. Existing lightweight methods, however, often incur accuracy loss, limiting their feasibility in practice, so making a model lightweight while maintaining high accuracy remains an urgent problem. To address this challenge, this paper proposes a lightweight YOLOv7 method based on partial convolution (PConv), the Squeeze-and-Excitation (SE) attention mechanism, and Wise-IoU (WIoU), referred to as YOLOv7-PSW. PConv effectively reduces the number of parameters and the computational complexity. SE helps the model focus on important feature information, improving performance. WIoU is introduced to measure the similarity between the detection box and the ground truth, allowing the model to reduce the false-positive rate. Applying these techniques to YOLOv7 yields a lightweight model that maintains high detection accuracy. Experimental results on the PASCAL VOC dataset show that YOLOv7-PSW outperforms the original YOLOv7 on object detection tasks: the number of parameters is reduced by 12.3%, FLOPs are reduced by 18.86%, and accuracy is improved by about 0.5%. With detection accuracy preserved, or even slightly improved, the parameter count and FLOPs are greatly reduced, achieving a meaningful degree of lightweighting. The proposed method offers new ideas and directions for subsequent research on lightweight object detection and is expected to promote its application on edge devices; YOLOv7-PSW can also be applied to other computer vision tasks to improve their performance and efficiency. In short, YOLOv7-PSW makes the model lightweight while maintaining high accuracy, which is significant for promoting the deployment of object detection algorithms on edge devices. [ABSTRACT FROM AUTHOR]
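
    PConv (from FasterNet) and SE are both published building blocks. The sketch below shows their commonly described forms in PyTorch, assuming a 1/4 partial ratio for PConv and a reduction factor of 16 for SE; it is not the authors' YOLOv7-PSW code.

```python
# Textbook-form sketches of the two blocks named in the abstract:
# PConv convolves only a fraction of the channels; SE reweights channels.
import torch
import torch.nn as nn

class PConv(nn.Module):
    """Partial convolution: apply a 3x3 conv to the first 1/r channels and
    pass the rest through untouched, cutting parameters and FLOPs."""
    def __init__(self, dim, r=4):
        super().__init__()
        self.dim_conv = dim // r
        self.conv = nn.Conv2d(self.dim_conv, self.dim_conv, 3, padding=1)

    def forward(self, x):
        a, b = torch.split(x, [self.dim_conv, x.size(1) - self.dim_conv], dim=1)
        return torch.cat([self.conv(a), b], dim=1)

class SE(nn.Module):
    """Squeeze-and-Excitation: global average pool, bottleneck MLP, sigmoid gate."""
    def __init__(self, dim, r=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(dim, dim // r), nn.ReLU(),
            nn.Linear(dim // r, dim), nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x).view(x.size(0), -1, 1, 1)
        return x * w  # channel-wise reweighting

x = torch.randn(1, 64, 32, 32)
print(SE(64)(PConv(64)(x)).shape)  # torch.Size([1, 64, 32, 32])
```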

  45.
    Academic Journal

    Authors: Zhou, Wen1 zhouwen@retryjsgzs.wecom.work, Wang, Xiaodon2 bruce@haitongit.com, Fan, Yusheng1,3 ysfan666@163.com, Yang, Yishuai4 yangyishuai0801@163.com, Wen, Yihan5 wenyihan4396@gmail.com, Li, Yixuan6 eason20230514@163.com, Xu, Yicheng1 YichengXu421@163.com, Lin, Zhengyuan7, Chen, Langlang1 1065348206@qq.com, Yao, Shizhou1 1095149556@qq.com, Zequn, Liu1 1694815257@qq.com, Wang, Jianqing8 laowang198899@sina.com

    Source: Computer Communications. Apr2024, Vol. 219, p271-281. 11p.

    Abstract: With the development of computer vision, small object detection has become a persistent pain point and difficulty of the field. Feature acquisition and accurate localization are two serious challenges for small objects at present. In this paper, a generalized small object detection algorithm is built from a multi-scale feature extractor, a feature search network with a hybrid attention mechanism, and knowledge distillation. The algorithm first extracts small-object features with the multi-scale feature extractor; second, it uses the CBAM attention mechanism and EfficientNet to perform a feature search over the feature map, helping to obtain more features of the small object; finally, it applies teacher–student knowledge distillation to the baseline model to help it locate detected objects. YOLOv5s is selected as the baseline, and the designed algorithm is fused into it; compared with the baseline model, the fused model's mAP on the mixed VOC dataset improves by 14.45% on average. The experimental results show that the designed algorithm effectively improves the detection performance of the object detection model on small objects. • The KDSAMLL algorithm, a lightweight small-target detection algorithm, was fused into the YOLOv5s benchmark network, and its mAP improved by 24.5% on the VOC2007+2012 dataset. • Knowledge distillation with the teacher–student model reduces the training load imposed by the feature module and improves model detection to some extent. • A dedicated small-target feature extraction layer, together with the feature module built on it, helps identify and localize more small-target objects. • The CBAM and EfficientNet networks can obtain substantially more effective small-target object features, which in turn improves the model's detection ability. [ABSTRACT FROM AUTHOR]
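
    The teacher–student distillation the abstract relies on is, in its generic form, a soft-target loss. A minimal sketch follows, assuming PyTorch; the temperature T, weight alpha, and logit shapes are illustrative assumptions, not the paper's formulation.

```python
# Minimal sketch of generic teacher-student knowledge distillation (not the
# paper's exact loss): the student matches temperature-softened teacher
# logits in addition to the hard labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft targets: KL divergence between softened distributions, scaled by
    # T^2 to keep gradient magnitudes comparable across temperatures.
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

s, t = torch.randn(8, 20), torch.randn(8, 20)  # 8 samples, 20 classes (toy)
y = torch.randint(0, 20, (8,))
print(distillation_loss(s, t, y).item())
```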

  46.
    Academic Journal

    Authors: Mou, Chao1,2, Zhu, Chengcheng1,2, Liu, Tengfei1,2, Cui, Xiaohui1,2 cuixiaohui@bjfu.edu.cn

    Source: IET Image Processing (Wiley-Blackwell). Apr2024, Vol. 18 Issue 5, p1296-1314. 19p.

    Abstract: Efficient animal detection is essential for biodiversity protection. Unmanned aerial vehicles (UAVs) have been widely used because of their low cost and minimal environmental intrusion. However, using UAVs for practical animal detection poses two challenges: (a) UAVs fly high to avoid disturbing animals, which leads to small-object detection problems; (b) the limited processing power of UAVs makes large state-of-the-art (SOTA) methods (e.g., You Only Look Once V7, YOLOv7) difficult to deploy. This work proposes WILD-YOLO, based on YOLOv7, to deal with these two problems. To detect small objects, WILD-YOLO improves upon YOLOv7 by adding a small-object detection head in the head part. To enable real-time animal detection in field environments with UAVs, the lightweight FasterNet and GhostNet are used to significantly reduce the model size. Compared to YOLOv7, WILD-YOLO significantly reduces the number of parameters, making it suitable for lightweight deployment on UAVs. Additionally, comparisons with other lightweight models such as YOLOv7-tiny, YOLOv5-s, YOLOv4-s, and MobileNetV2 are conducted on the datasets. The experimental results demonstrate that the proposed WILD-YOLO method outperforms the other approaches and has great potential for effective detection of wildlife in the complex environments encountered by UAVs. [ABSTRACT FROM AUTHOR]
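
    GhostNet's defining block is the Ghost module, which generates half of its output channels with a cheap depthwise convolution. A textbook-form sketch in PyTorch follows (not the WILD-YOLO implementation; the channel split and kernel sizes are the common defaults).

```python
# Sketch of a GhostNet "Ghost module": half the output channels come from a
# normal conv, the other half from a cheap depthwise conv over those
# primary features.
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    def __init__(self, inp, oup, kernel=1, dw_kernel=3):
        super().__init__()
        init = oup // 2  # primary channels; the rest are "ghost" features
        self.primary = nn.Sequential(
            nn.Conv2d(inp, init, kernel, padding=kernel // 2, bias=False),
            nn.BatchNorm2d(init), nn.ReLU(inplace=True),
        )
        self.cheap = nn.Sequential(
            nn.Conv2d(init, oup - init, dw_kernel, padding=dw_kernel // 2,
                      groups=init, bias=False),  # depthwise: very few params
            nn.BatchNorm2d(oup - init), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)

print(GhostModule(32, 64)(torch.randn(1, 32, 40, 40)).shape)  # [1, 64, 40, 40]
```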

  47.
    Academic Journal

    Authors: Xiao, Yun1, Liao, Hai1 hailiao_work@163.com

    Source: IET Image Processing (Wiley-Blackwell). Apr2024, Vol. 18 Issue 5, p1178-1188. 11p.

    Abstract: The low-light environment is integral to everyday activities but poses significant challenges for object detection. The low brightness, noise, and insufficient illumination of the acquired image reduce a model's object detection performance. Unlike recent studies that mostly develop supervised learning models, this paper proposes LIDA-YOLO, an approach for unsupervised adaptation of low-illumination object detectors. The model improves YOLOv3 by using normal-illumination images as the source domain and low-illumination images as the target domain, achieving object detection in low-illumination images through an unsupervised learning strategy. Specifically, multi-scale local feature alignment and global feature alignment modules are proposed to align the overall attributes of the image, thus reducing feature biases in background, scene, and target layout. On the ExDark dataset, LIDA-YOLO achieves the highest mAP, 56.65%, compared with several current state-of-the-art unsupervised domain adaptation object detection methods: a 4.04% improvement over I3Net and a 6.5% improvement over OSHOT. LIDA-YOLO also achieves a 2.7% improvement over the supervised baseline method YOLOv3. Overall, the proposed LIDA-YOLO model requires fewer samples and generalizes better than previous works. [ABSTRACT FROM AUTHOR]
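
    Unsupervised adversarial feature alignment of this kind is commonly built on a gradient reversal layer. A minimal DANN-style sketch follows, assuming PyTorch; the feature dimension, discriminator shape, and domain labels are illustrative, and this is not the LIDA-YOLO code.

```python
# Sketch of adversarial feature alignment via gradient reversal: the
# discriminator learns to tell domains apart, while reversed gradients push
# the feature extractor to make the two domains indistinguishable.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad):
        return -ctx.lam * grad, None  # flip the gradient sign on the way back

disc = nn.Sequential(nn.Linear(256, 64), nn.ReLU(), nn.Linear(64, 1))
bce = nn.BCEWithLogitsLoss()
src = torch.randn(16, 256, requires_grad=True)  # normal-illumination features
tgt = torch.randn(16, 256, requires_grad=True)  # low-illumination features
feats = torch.cat([GradReverse.apply(src, 1.0), GradReverse.apply(tgt, 1.0)])
domain = torch.cat([torch.zeros(16, 1), torch.ones(16, 1)])  # 0=source, 1=target
loss = bce(disc(feats), domain)
loss.backward()  # discriminator learns domains; features learn to align
print(loss.item())
```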

  48.
    Academic Journal

    Authors: Ji, Yaxin1, Di, Lan1 dilan@jiangnan.edu.cn

    Source: IET Image Processing (Wiley-Blackwell). 2/7/2024, Vol. 18 Issue 2, p412-427. 16p.

    Abstract: This paper presents a textile defect detection method that uses a multi-proportion spatial attention mechanism and a channel memory feature fusion network to address the difficulties posed by complicated defect shapes and large size variations. First, a multi-proportion spatial attention mechanism (MPAM) is introduced, which employs multi-proportion convolution to improve the backbone network's capacity to detect non-uniform structural defects, and a multi-scale spatial pyramid pooling structure (MS-SPP) enhances the generality and adaptability of the model. Second, a memory feature fusion network based on channel attention is developed, which adaptively weights the feature channels, focusing on crucial information channels to efficiently fuse contextual features and enhance the model's memory capacity. Finally, a novel efficient Wise-IoU (EWIoU) loss function is proposed, which uses a dynamic non-monotonic focusing mechanism to increase the penalty on distance measurement, thus enhancing the model's detection performance. Experimental findings on the ZJU-Leaper and Tianchi textile datasets reveal that, compared to the YOLOv7 baseline, the method improves detection accuracy by 6.5 and 2 percentage points, respectively, and outperforms most existing networks. [ABSTRACT FROM AUTHOR]
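
    The abstract names EWIoU but does not fully specify it. As a stand-in, here is a sketch of the closely related DIoU loss, which likewise augments IoU with a penalty on the distance between box centers; the (x1, y1, x2, y2) box layout and the toy boxes are assumptions, and this is not the paper's EWIoU.

```python
# IoU loss with a normalized center-distance penalty (DIoU-style), shown as
# an illustrative relative of the EWIoU described in the abstract.
import torch

def diou_loss(pred, gt, eps=1e-7):
    # Intersection and union areas.
    x1 = torch.max(pred[:, 0], gt[:, 0]); y1 = torch.max(pred[:, 1], gt[:, 1])
    x2 = torch.min(pred[:, 2], gt[:, 2]); y2 = torch.min(pred[:, 3], gt[:, 3])
    inter = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_g = (gt[:, 2] - gt[:, 0]) * (gt[:, 3] - gt[:, 1])
    iou = inter / (area_p + area_g - inter + eps)
    # Squared center distance, normalized by the enclosing box diagonal.
    cp = (pred[:, :2] + pred[:, 2:]) / 2
    cg = (gt[:, :2] + gt[:, 2:]) / 2
    ex1 = torch.min(pred[:, 0], gt[:, 0]); ey1 = torch.min(pred[:, 1], gt[:, 1])
    ex2 = torch.max(pred[:, 2], gt[:, 2]); ey2 = torch.max(pred[:, 3], gt[:, 3])
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2 + eps
    dist2 = ((cp - cg) ** 2).sum(dim=1)
    return (1 - iou + dist2 / diag2).mean()

pred = torch.tensor([[10., 10., 50., 50.]])
gt = torch.tensor([[12., 14., 48., 52.]])
print(diou_loss(pred, gt).item())
```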

  49.
    Academic Journal

    Authors: Guan, Shengxian1,2, Dong, Shuai1, Gao, Yuefang2, Zou, Kun1 cszoukun@foxmail.com

    Source: IET Image Processing (Wiley-Blackwell). 2/7/2024, Vol. 18 Issue 2, p362-378. 17p.

    Abstract: Cross-domain object detection aims to generalize the distribution of features extracted by an object detector from an annotated domain to an unknown, unlabelled domain. Although one-stage cross-domain object detectors have significant deployment advantages over two-stage ones, they suffer from two problems. First, neglect of category features and inaccurate alignment between multiple category features decrease domain adaptation efficiency. Second, one-stage detectors are more sensitive to sample imbalance, and negative samples severely affect the alignment process of domain adaptation. To overcome these two problems, this paper proposes an innovative category-related attention domain-adaptive method that refines the discrimination of each category's features. In the proposed method, a group of domain discriminators is assigned, one per category, to refine the fine-grained features between categories. The discriminators are trained via an adversarial discriminant framework to align the fine-grained distributions across different domains. A category attention alignment (CAA) module is proposed to direct more attention to foreground regions at the instance level, which effectively alleviates the negative-migration problem caused by the positive and negative sample imbalance of the one-stage detector. Specifically, two sub-modules of the CAA module are developed, a local CAA module and a global CAA module, which optimize the domain offsets in the local and global dimensions respectively. In addition, a progressive global alignment module is designed to align image-level features, offering prior knowledge of migration to the CAA module. The progressive global alignment module and the CAA module collaboratively engage in benign competition with the backbone network across various levels. Extensive transfer experiments are conducted among Cityscapes, Foggy Cityscapes, SIM10K, and KITTI. The experimental results show that the proposed method performs substantially better than other one-stage cross-domain detectors. [ABSTRACT FROM AUTHOR]
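
    A hedged sketch of the "one domain discriminator per category" idea in generic form; the feature dimension, class count, and routing by predicted class are assumptions, not the paper's architecture.

```python
# Per-category domain discriminators: each instance feature is routed to its
# own category's discriminator, so domain alignment is fine-grained per class
# rather than marginal over all classes.
import torch
import torch.nn as nn

num_classes, feat_dim = 5, 128
discs = nn.ModuleList(
    nn.Sequential(nn.Linear(feat_dim, 32), nn.ReLU(), nn.Linear(32, 1))
    for _ in range(num_classes)
)
bce = nn.BCEWithLogitsLoss()

def per_category_domain_loss(feats, cls_ids, domain_label):
    # Average the adversarial loss over the categories present in the batch.
    total, n = feats.new_zeros(()), 0
    for c in range(num_classes):
        sel = feats[cls_ids == c]
        if len(sel):
            tgt = torch.full((len(sel), 1), float(domain_label))
            total, n = total + bce(discs[c](sel), tgt), n + 1
    return total / max(n, 1)

feats = torch.randn(12, feat_dim)               # instance-level features (toy)
cls_ids = torch.randint(0, num_classes, (12,))  # predicted category per instance
print(per_category_domain_loss(feats, cls_ids, domain_label=0).item())
```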

  50.
    Academic Journal

    Authors: Yan, Chunman1,2 yancm2022@163.com, Zhang, Xiao1 2022222483@nwnu.edu.cn

    Source: International Journal of Pattern Recognition & Artificial Intelligence. Feb2024, Vol. 38 Issue 2, p1-23. 23p.

    Abstract: Head Pose Estimation (HPE) has a wide range of applications in computer vision but still faces challenges: (1) existing studies commonly use Euler angles or quaternions as pose labels, which may lead to discontinuity problems; (2) regression via rotation matrices is not effectively addressed; (3) recognition rates are low in complex scenes and computational requirements are high. This paper presents an improved unconstrained HPE model to address these challenges. First, a rotation matrix form is introduced to solve the problem of ambiguous rotation labels. Second, a continuous 6D rotation matrix representation is used for efficient and robust direct regression. The lightweight RepVGG-A2 framework is used for feature extraction, and a multi-level feature fusion module and a coordinate attention mechanism with residual connections are added to improve the network's ability to perceive contextual information and attend to salient features. The model's accuracy is further improved by replacing the network activation function and improving the loss function. Experiments on the BIWI dataset with a 7:3 train/test split show that the mean absolute error of HPE for the proposed model is 2.41. Trained on the 300W_LP dataset and tested on the AFLW2000 and BIWI datasets, the mean absolute errors are 4.34 and 3.93, respectively. The experimental results demonstrate that the improved network has better HPE performance. [ABSTRACT FROM AUTHOR]
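
    The "continuous 6D rotation matrix representation" is a known construction (Gram-Schmidt on two 3-vectors, after Zhou et al., "On the Continuity of Rotation Representations in Neural Networks"). A minimal sketch, assuming PyTorch; this shows the representation only, not the paper's network.

```python
# Map a 6D vector to a rotation matrix: normalize the first 3-vector,
# orthogonalize the second against it, and complete a right-handed frame
# with a cross product. The mapping is continuous, unlike Euler angles.
import torch
import torch.nn.functional as F

def rotation_6d_to_matrix(d6):
    """Convert a (..., 6) tensor into (..., 3, 3) rotation matrices."""
    a1, a2 = d6[..., :3], d6[..., 3:]
    b1 = F.normalize(a1, dim=-1)                                  # column 1
    b2 = F.normalize(a2 - (b1 * a2).sum(-1, keepdim=True) * b1, dim=-1)
    b3 = torch.cross(b1, b2, dim=-1)                              # column 3
    return torch.stack((b1, b2, b3), dim=-2).transpose(-1, -2)

R = rotation_6d_to_matrix(torch.randn(4, 6))
# Sanity check: R^T R should be the identity (orthonormal columns).
print(torch.allclose(R.transpose(-1, -2) @ R, torch.eye(3).expand(4, 3, 3),
                     atol=1e-5))
```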
