يعرض 1 - 10 نتائج من 10,744 نتيجة بحث عن '"Cheng, Kai"', وقت الاستعلام: 0.86s تنقيح النتائج
  1. 1
    تقرير

    الوصف: Recently, 3D Gaussian Splatting(3DGS) has revolutionized neural rendering with its high-quality rendering and real-time speed. However, when it comes to indoor scenes with a significant number of textureless areas, 3DGS yields incomplete and noisy reconstruction results due to the poor initialization of the point cloud and under-constrained optimization. Inspired by the continuity of signed distance field (SDF), which naturally has advantages in modeling surfaces, we present a unified optimizing framework integrating neural SDF with 3DGS. This framework incorporates a learnable neural SDF field to guide the densification and pruning of Gaussians, enabling Gaussians to accurately model scenes even with poor initialized point clouds. At the same time, the geometry represented by Gaussians improves the efficiency of the SDF field by piloting its point sampling. Additionally, we regularize the optimization with normal and edge priors to eliminate geometry ambiguity in textureless areas and improve the details. Extensive experiments in ScanNet and ScanNet++ show that our method achieves state-of-the-art performance in both surface reconstruction and novel view synthesis.

    الوصول الحر: http://arxiv.org/abs/2405.19671Test

  2. 2
    تقرير

    الوصف: We present DC-Gaussian, a new method for generating novel views from in-vehicle dash cam videos. While neural rendering techniques have made significant strides in driving scenarios, existing methods are primarily designed for videos collected by autonomous vehicles. However, these videos are limited in both quantity and diversity compared to dash cam videos, which are more widely used across various types of vehicles and capture a broader range of scenarios. Dash cam videos often suffer from severe obstructions such as reflections and occlusions on the windshields, which significantly impede the application of neural rendering techniques. To address this challenge, we develop DC-Gaussian based on the recent real-time neural rendering technique 3D Gaussian Splatting (3DGS). Our approach includes an adaptive image decomposition module to model reflections and occlusions in a unified manner. Additionally, we introduce illumination-aware obstruction modeling to manage reflections and occlusions under varying lighting conditions. Lastly, we employ a geometry-guided Gaussian enhancement strategy to improve rendering details by incorporating additional geometry priors. Experiments on self-captured and public dash cam videos show that our method not only achieves state-of-the-art performance in novel view synthesis, but also accurately reconstructing captured scenes getting rid of obstructions.
    Comment: 9 pages,7 figures;project page: https://linhanwang.github.io/dcgaussianTest/

    الوصول الحر: http://arxiv.org/abs/2405.17705Test

  3. 3
    تقرير

    الوصف: Explicit Congestion Notification (ECN)-based congestion control schemes have been widely adopted in high-speed data center networks (DCNs), where the ECN marking threshold plays a determinant role in guaranteeing a packet lossless DCN. However, existing approaches either employ static settings with immutable thresholds that cannot be dynamically self-adjusted to adapt to network dynamics, or fail to take into account many-to-one traffic patterns and different requirements of different types of traffic, resulting in relatively poor performance. To address these problems, this paper proposes a novel learning-based automatic ECN tuning scheme, named PET, based on the multi-agent Independent Proximal Policy Optimization (IPPO) algorithm. PET dynamically adjusts ECN thresholds by fully considering pivotal congestion-contributing factors, including queue length, output data rate, output rate of ECN-marked packets, current ECN threshold, the extent of incast, and the ratio of mice and elephant flows. PET adopts the Decentralized Training and Decentralized Execution (DTDE) paradigm and combines offline and online training to accommodate network dynamics. PET is also fair and readily deployable with commodity hardware. Comprehensive experimental results demonstrate that, compared with state-of-the-art static schemes and the learning-based automatic scheme, our PET achieves better performance in terms of flow completion time, convergence rate, queue length variance, and system robustness.

    الوصول الحر: http://arxiv.org/abs/2405.11956Test

  4. 4
    تقرير

    الوصف: Language-guided scene-aware human motion generation has great significance for entertainment and robotics. In response to the limitations of existing datasets, we introduce LaserHuman, a pioneering dataset engineered to revolutionize Scene-Text-to-Motion research. LaserHuman stands out with its inclusion of genuine human motions within 3D environments, unbounded free-form natural language descriptions, a blend of indoor and outdoor scenarios, and dynamic, ever-changing scenes. Diverse modalities of capture data and rich annotations present great opportunities for the research of conditional motion generation, and can also facilitate the development of real-life applications. Moreover, to generate semantically consistent and physically plausible human motions, we propose a multi-conditional diffusion model, which is simple but effective, achieving state-of-the-art performance on existing datasets.

    الوصول الحر: http://arxiv.org/abs/2403.13307Test

  5. 5
    تقرير

    الوصف: The advent of 3D Gaussian Splatting (3DGS) has recently brought about a revolution in the field of neural rendering, facilitating high-quality renderings at real-time speed. However, 3DGS heavily depends on the initialized point cloud produced by Structure-from-Motion (SfM) techniques. When tackling with large-scale scenes that unavoidably contain texture-less surfaces, the SfM techniques always fail to produce enough points in these surfaces and cannot provide good initialization for 3DGS. As a result, 3DGS suffers from difficult optimization and low-quality renderings. In this paper, inspired by classical multi-view stereo (MVS) techniques, we propose GaussianPro, a novel method that applies a progressive propagation strategy to guide the densification of the 3D Gaussians. Compared to the simple split and clone strategies used in 3DGS, our method leverages the priors of the existing reconstructed geometries of the scene and patch matching techniques to produce new Gaussians with accurate positions and orientations. Experiments on both large-scale and small-scale scenes validate the effectiveness of our method, where our method significantly surpasses 3DGS on the Waymo dataset, exhibiting an improvement of 1.15dB in terms of PSNR.
    Comment: See the project page for code, data: https://kcheng1021.github.io/gaussianpro.github.ioTest

    الوصول الحر: http://arxiv.org/abs/2402.14650Test

  6. 6
    تقرير

    الوصف: Monocular depth estimation from RGB images plays a pivotal role in 3D vision. However, its accuracy can deteriorate in challenging environments such as nighttime or adverse weather conditions. While long-wave infrared cameras offer stable imaging in such challenging conditions, they are inherently low-resolution, lacking rich texture and semantics as delivered by the RGB image. Current methods focus solely on a single modality due to the difficulties to identify and integrate faithful depth cues from both sources. To address these issues, this paper presents a novel approach that identifies and integrates dominant cross-modality depth features with a learning-based framework. Concretely, we independently compute the coarse depth maps with separate networks by fully utilizing the individual depth cues from each modality. As the advantageous depth spreads across both modalities, we propose a novel confidence loss steering a confidence predictor network to yield a confidence map specifying latent potential depth areas. With the resulting confidence map, we propose a multi-modal fusion network that fuses the final depth in an end-to-end manner. Harnessing the proposed pipeline, our method demonstrates the ability of robust depth estimation in a variety of difficult scenarios. Experimental results on the challenging MS$^2$ and ViViD++ datasets demonstrate the effectiveness and robustness of our method.

    الوصول الحر: http://arxiv.org/abs/2402.11826Test

  7. 7
    تقرير

    مصطلحات موضوعية: Computer Science - Multimedia

    الوصف: The development of virtual try-on has revolutionized online shopping by allowing customers to visualize themselves in various fashion items, thus extending the in-store try-on experience to the cyber space. Although virtual try-on has attracted considerable research initiatives, existing systems only focus on the quality of image generation, overlooking whether the fashion item is a good match to the given person and clothes. Recognizing this gap, we propose to design a one-stop Smart Fitting Room, with the novel formulation of matching-aware virtual try-on. Following this formulation, we design a Hybrid Matching-aware Virtual Try-On Framework (HMaVTON), which combines retrieval-based and generative methods to foster a more personalized virtual try-on experience. This framework integrates a hybrid mix-and-match module and an enhanced virtual try-on module. The former can recommend fashion items available on the platform to boost sales and generate clothes that meets the diverse tastes of consumers. The latter provides high-quality try-on effects, delivering a one-stop shopping service. To validate the effectiveness of our approach, we enlist the expertise of fashion designers for a professional evaluation, assessing the rationality and diversity of the clothes combinations and conducting an evaluation matrix analysis. Our method significantly enhances the practicality of virtual try-on. The code is available at https://github.com/Yzcreator/HMaVTONTest.

    الوصول الحر: http://arxiv.org/abs/2401.16825Test

  8. 8
    تقرير

    الوصف: Multi-camera setups find widespread use across various applications, such as autonomous driving, as they greatly expand sensing capabilities. Despite the fast development of Neural radiance field (NeRF) techniques and their wide applications in both indoor and outdoor scenes, applying NeRF to multi-camera systems remains very challenging. This is primarily due to the inherent under-calibration issues in multi-camera setup, including inconsistent imaging effects stemming from separately calibrated image signal processing units in diverse cameras, and system errors arising from mechanical vibrations during driving that affect relative camera poses. In this paper, we present UC-NeRF, a novel method tailored for novel view synthesis in under-calibrated multi-view camera systems. Firstly, we propose a layer-based color correction to rectify the color inconsistency in different image regions. Second, we propose virtual warping to generate more viewpoint-diverse but color-consistent virtual views for color correction and 3D recovery. Finally, a spatiotemporally constrained pose refinement is designed for more robust and accurate pose calibration in multi-camera systems. Our method not only achieves state-of-the-art performance of novel view synthesis in multi-camera setups, but also effectively facilitates depth estimation in large-scale outdoor scenes with the synthesized novel views.
    Comment: See the project page for code, data: https://kcheng1021.github.io/ucnerf.github.ioTest

    الوصول الحر: http://arxiv.org/abs/2311.16945Test

  9. 9
    دورية أكاديمية

    المصدر: Brazilian Journal of Pharmaceutical Sciences. January 2018 54(2)

    الوصف: The aim of the present study was to characterize biotin-decorated docetaxel-loaded bovine serum albumin nanoparticles (DTX-BIO-BSA-NPs) and evaluate their antiproliferative activity in vitro. The particle size of prepared DTX-BIO-BSA-NPs was found to be always lower than 200 nm, with sizes of 166.9, 160.3, 159.0, 176.1 and 184.8 nm and the zeta potential was -29.51, -28.54, -36.54, -36.08 and -27.56 mV after redissolution with water for 0, 1, 2, 4 and 8 hours, respectively. The polydispersity index (PDI) was stable in the range of 0.170 - 0.178. In the in vitro drug-release study, the DTX-BIO-BSA-NPs targeted a human breast cancer cell line MCF-7 effectively. The x-ray diffraction spectrum and DSC curve of DTX-BIO-BSA-NPs suggested that docetaxel was in an amorphous or disordered crystalline phase in DTX-BIO-BSA-NPs. In vitro cytotoxicity results showed that DTX-BIO-BSA-NPs inhibits proliferation of MCF-7, SGC7901, LS-174T and A549 cells in a concentration-dependent manner after exposure to DTX-BIO-BSA-NPs for 48 hours. Taken together, these results indicate that DTX-BIO-BSA-NPs may have potential as an alternative delivery system for parenteral administration of docetaxel.

    وصف الملف: text/html

  10. 10
    تقرير

    الوصف: This report provides a concise overview of the proposed North system, which aims to achieve automatic word/syllable recognition for Taiwanese Hakka (Sixian). The report outlines three key components of the system: the acquisition, composition, and utilization of the training data; the architecture of the model; and the hardware specifications and operational statistics. The demonstration of the system has been made public at https://asrvm.iis.sinica.edu.tw/hakka_sixianTest.

    الوصول الحر: http://arxiv.org/abs/2310.03443Test