Electrical Engineering and Systems Science

New submissions

New submissions for Mon, 3 Jun 24

[1]  arXiv:2405.20357 [pdf, ps, other]
Title: Encryption in ghost imaging with Kronecker products of random matrices
Comments: 5 pages, 4 figures
Subjects: Image and Video Processing (eess.IV); Applied Physics (physics.app-ph); Optics (physics.optics)

By forming measurement matrices with the Kronecker product of two random matrices, image encryption in computational ghost imaging is investigated. The two-dimensional images are conveniently reconstructed with the pseudo-inverse matrices of the two random matrices. To suppress the noise, the method of truncated singular value decomposition can be applied to either or both of the two pseudo-inverse matrices. Further, our proposal facilitates image encryption since more matrices can be involved in forming the measurement matrix. Two permutation matrices are inserted into the matrix sequence. The image information can only be reconstructed with the correct permutation matrices and the matrix sequence in image decryption. The experimental results demonstrate the benefits of our proposal. The technique paves the way for the practicality and flexibility of computational ghost imaging.
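
The reconstruction and decryption steps described in this abstract can be illustrated with a small numerical sketch. It is not the authors' implementation: the image size, the Gaussian random matrices, the two fixed permutations, and where the permutations sit in the matrix sequence are all assumptions made for illustration, and noise and truncated SVD are left out. The sketch relies on the identity that, for row-major vectorization, `kron(M, N) @ vec(X)` equals `vec(M @ X @ N.T)`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8    # image is n x n (hypothetical size)
m = 12   # rows of each random factor; m >= n gives full column rank

X = rng.random((n, n))                    # stand-in for the image
A = rng.standard_normal((m, n))           # first random matrix
B = rng.standard_normal((m, n))           # second random matrix
P1 = np.eye(n)[::-1]                      # assumed key 1: reverse the rows
P2 = np.roll(np.eye(n), 1, axis=0)        # assumed key 2: cyclic shift

# Kronecker-structured measurement matrix acting on the row-major vec(X).
M = np.kron(A @ P1, B @ P2.T)
Y = (M @ X.ravel()).reshape(m, m)         # same as A @ P1 @ X @ P2 @ B.T


def decrypt(Y, A, B, P1, P2):
    """Undo the two random factors with pseudo-inverses, then the permutations."""
    Z = np.linalg.pinv(A) @ Y @ np.linalg.pinv(B.T)   # equals P1 @ X @ P2
    return P1.T @ Z @ P2.T


X_ok = decrypt(Y, A, B, P1, P2)                  # correct keys
X_bad = decrypt(Y, A, B, np.eye(n), np.eye(n))   # wrong keys

print("error with correct keys:", np.abs(X_ok - X).max())
print("error with wrong keys:  ", np.abs(X_bad - X).max())
```

Working with the two small factors separately avoids ever inverting the full Kronecker matrix, which is what makes the pseudo-inverse reconstruction convenient.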

[2]  arXiv:2405.20387 [pdf, ps, other]
Title: Sensitivity Analysis for Piecewise-Affine Approximations of Nonlinear Programs with Polytopic Constraints
Comments: 6 pages, 4 figures, accepted for publication in IEEE Control Systems Letters
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)

Nonlinear Programs (NLPs) are prevalent in optimization-based control of nonlinear systems. Solving general NLPs is computationally expensive, necessitating the development of fast hardware or tractable suboptimal approximations. This paper investigates the sensitivity of the solutions of NLPs with polytopic constraints when the nonlinear continuous objective function is approximated by a PieceWise-Affine (PWA) counterpart. By leveraging perturbation analysis using a convex modulus, we derive guaranteed bounds on the distance between the optimal solution of the original polytopically-constrained NLP and that of its approximated formulation. Our approach aids in determining criteria for achieving desired solution bounds. Two case studies on the Eggholder function and nonlinear model predictive control of an inverted pendulum demonstrate the theoretical results.

[3]  arXiv:2405.20392 [pdf, other]
Title: Can No-Reference Quality-Assessment Methods Serve as Perceptual Losses for Super-Resolution?
Comments: 4 pages, 3 figures. The first two authors contributed equally to this work
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Perceptual losses play an important role in constructing deep-neural-network-based methods by increasing the naturalness and realism of processed images and videos. Use of perceptual losses is often limited to LPIPS, a full-reference method. Even though deep no-reference image-quality-assessment methods are excellent at predicting human judgment, little research has examined their incorporation in loss functions. This paper investigates direct optimization of several video-super-resolution models using no-reference image-quality-assessment methods as perceptual losses. Our experimental results show that straightforward optimization of these methods produces artifacts, but a special training procedure can mitigate them.

[4]  arXiv:2405.20402 [pdf, other]
Title: Cross-Talk Reduction
Comments: in International Joint Conference on Artificial Intelligence (IJCAI), 2024
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD); Signal Processing (eess.SP)

While far-field multi-talker mixtures are recorded, each speaker can wear a close-talk microphone so that close-talk mixtures can be recorded at the same time. Although each close-talk mixture has a high signal-to-noise ratio (SNR) of the wearer, it has a very limited range of applications, as it also contains significant cross-talk speech by other speakers and is not clean enough. In this context, we propose a novel task named cross-talk reduction (CTR) which aims at reducing cross-talk speech, and a novel solution named CTRnet which is based on unsupervised or weakly-supervised neural speech separation. In unsupervised CTRnet, close-talk and far-field mixtures are stacked as input for a DNN to estimate the close-talk speech of each speaker. It is trained in an unsupervised, discriminative way such that the DNN estimate for each speaker can be linearly filtered to cancel out the speaker's cross-talk speech captured at other microphones. In weakly-supervised CTRnet, we assume the availability of each speaker's activity timestamps during training, and leverage them to improve the training of unsupervised CTRnet. Evaluation results on a simulated two-speaker CTR task and on a real-recorded conversational speech separation and recognition task show the effectiveness and potential of CTRnet.

[5]  arXiv:2405.20433 [pdf, other]
Title: Efficient Industrial Refrigeration Scheduling with Peak Pricing
Subjects: Systems and Control (eess.SY)

The widespread use of industrial refrigeration systems across various sectors contributes significantly to global energy consumption, highlighting substantial opportunities for energy conservation through intelligent control design. As such, this work focuses on the design of control algorithms for industrial refrigeration that minimize operational costs and provide efficient heat extraction. By adopting tools from inventory control, we characterize the structure of these optimal control policies, exploring the impact of different energy cost-rate structures such as time-of-use (TOU) pricing and peak pricing. While classical threshold policies are optimal under TOU costs, introducing peak pricing challenges their optimality, emphasizing the need for carefully designed control strategies in the presence of significant peak costs. We provide theoretical findings and simulation studies on this phenomenon, offering insights for more efficient industrial refrigeration management.

[6]  arXiv:2405.20449 [pdf, other]
Title: Optimization, guidance, and control of low-thrust transfers from the Lunar Gateway to low lunar orbit
Comments: 19 pages, 12 figures, IAC 2023, ACTA ASTRONAUTICA 2024
Subjects: Systems and Control (eess.SY); Dynamical Systems (math.DS); Optimization and Control (math.OC); Space Physics (physics.space-ph)

The Gateway will represent a primary space system useful for the Artemis program, Earth-Moon transportation, and deep space exploration. It is expected to serve as a staging location on the way to the lunar surface. This study focuses on low-thrust transfer dynamics, from the Near-Rectilinear Halo Orbit traveled by Gateway to a specified Low-altitude Lunar Orbit (LLO). This research addresses: (i) determination of the minimum-time low-thrust trajectory and (ii) design, implementation, and testing of a guidance and control architecture, for a space vehicle that travels from Gateway to LLO. Orbit dynamics is described in terms of modified equinoctial elements, in the context of a high-fidelity ephemeris model. The minimum-time trajectory from Gateway to a specified lunar orbit is determined through an indirect heuristic approach, which uses the analytical conditions arising in optimal control theory in conjunction with a heuristic technique. However, future missions will pursue a growing level of autonomy, and this circumstance implies the mandatory design of an efficient feedback guidance scheme, capable of compensating for nonnominal flight conditions. This research proposes nonlinear orbit control as a viable option for autonomous explicit guidance of low-thrust transfers from Gateway to LLO. This approach allows defining a feedback law that enjoys quasi-global stability properties without requiring any offline reference trajectory. The overall spacecraft dynamics is modeled including attitude control and actuation. The latter is delegated to an array of reaction wheels, arranged in a pyramidal configuration. Guidance, attitude control, and actuation are implemented in an iterative scheme. Monte Carlo simulations demonstrate that the guidance and control architecture is effective with random starting points from Gateway and the temporary unavailability of the propulsion system.

[7]  arXiv:2405.20458 [pdf, other]
Title: Contingency-Aware Station-Keeping Control of Halo Orbits
Subjects: Systems and Control (eess.SY)

We present an algorithm to perform fuel-optimal station-keeping for spacecraft in unstable halo orbits with additional constraints to ensure safety in the event of a control failure. We formulate a convex trajectory-optimization problem to generate impulsive spacecraft maneuvers to loosely track a halo orbit using a receding-horizon controller. Our solution also provides a safe exit strategy in the event that propulsion is lost at any point in the mission. We validate our algorithm in simulations of the three-body Earth-Moon and Saturn-Enceladus systems, demonstrating both low total delta-v and a safe contingency plan throughout the mission.

[8]  arXiv:2405.20471 [pdf, other]
Title: Equivalent External Noise Temperature of Time-Varying Receivers
Comments: 12 pages, 8 figures. Submitted to IEEE Transactions on Antennas and Propagation May 30, 2024
Subjects: Systems and Control (eess.SY)

The equivalent external noise temperature of time-varying antennas is studied using the concept of cross-frequency effective aperture, which quantifies the intermodulation conversion of external noise across the frequency spectrum into a receiver's operational bandwidth. The theoretical tools for this approach are laid out following the classical method for describing external noise temperature of linear time-invariant antennas, with generalizations made along the way to capture the effects of time-varying components or materials. The results demonstrate the specific ways that a time-varying system's noise characteristics are dependent on its cross-frequency effective aperture and the broadband noise environment. The general theory is applied to several examples, including abstract models of hypothetical systems, antennas integrated with parametric amplification, and time-modulated arrays.

[9]  arXiv:2405.20489 [pdf, other]
Title: Stability-Constrained Learning for Frequency Regulation in Power Grids with Variable Inertia
Comments: This paper is to appear in IEEE Control Systems Letters (L-CSS)
Subjects: Systems and Control (eess.SY)

The increasing penetration of converter-based renewable generation has resulted in faster frequency dynamics, and low and variable inertia. As a result, there is a need for frequency control methods that are able to stabilize a disturbance in the power system at timescales comparable to the fast converter dynamics. This paper proposes a combined linear and neural network controller for inverter-based primary frequency control that is stable at time-varying levels of inertia. We model the time-variance in inertia via a switched affine hybrid system model. We derive stability certificates for the proposed controller via a quadratic candidate Lyapunov function. We test the proposed control on a 12-bus 3-area test network, and compare its performance with a base case linear controller, optimized linear controller, and finite-horizon Linear Quadratic Regulator (LQR). Our proposed controller achieves faster mean settling time and over 50% reduction in average control cost across 100 inertia scenarios compared to the optimized linear controller. Unlike LQR, which requires complete knowledge of the inertia trajectories and system dynamics over the entire control time horizon, our proposed controller is real-time tractable and achieves comparable performance to LQR.

[10]  arXiv:2405.20496 [pdf, other]
Title: Investigations into Uncertain Control Co-Design Implementations for stochastic in expectation and worst-case robust
Comments: 16 pages and 8 figures
Subjects: Systems and Control (eess.SY)

As uncertainty considerations become increasingly important aspects of concurrent plant and control optimization, it is imperative to identify and compare the impact of uncertain control co-design (UCCD) formulations on their associated solutions. While previous work has developed the theory for various UCCD formulations, their implementation, along with an in-depth discussion of the structure of UCCD problems, implicit assumptions, method-dependent considerations, and practical insights, is currently missing from the literature. Therefore, in this study, we address some of these limitations by proposing two optimal control structures for UCCD problems that we refer to as the open-loop single-control (OLSC) and open-loop multiple-control (OLMC). Next, we implement the stochastic in expectation UCCD (SE-UCCD) and worst-case robust UCCD (WCR-UCCD) for a simplified strain-actuated solar array (SASA) case study. For the implementation of SE-UCCD, we use generalized Polynomial Chaos expansion and benchmark the results against Monte Carlo Simulation. Next, we solve a simple SASA WCR-UCCD through OLSC and OLMC structures. Insights from such implementations indicate that constructing, implementing, and solving a UCCD problem requires an in-depth understanding of the problem at hand, formulations, and solution strategies to best address the underlying co-design under uncertainty questions.

[11]  arXiv:2405.20502 [pdf, ps, other]
Title: Reach-Avoid Control Synthesis for a Quadrotor UAV with Formal Safety Guarantees
Subjects: Systems and Control (eess.SY); Dynamical Systems (math.DS); Optimization and Control (math.OC)

Reach-avoid specifications are one of the most common tasks in unmanned aerial vehicle (UAV) applications. Despite the intensive research and development associated with control of aerial vehicles, generating feasible trajectories through complex environments and tracking them with formal safety guarantees remain challenging. In this paper, we propose a control framework for a quadrotor UAV that enables accomplishing reach-avoid tasks with formal safety guarantees. In this proposed framework, we integrate geometric control theory for tracking and polynomial trajectory generation using Bezier curves, where tracking errors are accounted for in the trajectory synthesis process. To estimate the tracking errors, we revisit the stability analysis of the closed-loop quadrotor system, when geometric control is implemented. We show that the tracking error dynamics exhibit local exponential stability when geometric control is implemented with any positive control gains, and we derive tight uniform bounds of the tracking error. We also introduce sufficient conditions to be imposed on the desired trajectory utilizing the derived uniform bounds to ensure the well-definedness of the closed-loop system. For the trajectory synthesis, we present an efficient algorithm that enables constructing a safe tube by means of sampling-based planning and safe hyper-rectangular set computations. Then, we compute the trajectory, given as a piecewise continuous Bezier curve, through the safe tube, where an efficient heuristic approach that utilizes iterative linear programming is employed. We present extensive numerical simulations with a cluttered environment to illustrate the effectiveness of the proposed framework in reach-avoid planning scenarios.

[12]  arXiv:2405.20549 [pdf, other]
Title: Discrete-Time Implementation of Explicit Reference Governor
Subjects: Systems and Control (eess.SY)

Explicit reference governor (ERG) is an add-on unit that provides constraint handling capability to pre-stabilized systems. The main idea behind ERG is to manipulate the derivative of the applied reference in continuous time such that the satisfaction of state and input constraints is guaranteed at all times. However, ERG should be practically implemented in discrete-time. This paper studies the discrete-time implementation of ERG, and provides conditions under which the feasibility and convergence properties of the ERG framework are maintained when the updates of the applied reference are performed in discrete time. The proposed approach is validated via extensive simulation and experimental studies.
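
The idea of steering the derivative of the applied reference can be made concrete with a toy example. This is a sketch under stated assumptions, not the discrete-time scheme or the conditions derived in the paper. It assumes a scalar pre-stabilized plant x' = -a (x - v), a single constraint x <= x_max, the Lyapunov function V = (x - v)^2 with threshold (x_max - v)^2, and a plain forward-Euler update of the applied reference v. All parameter values are hypothetical.

```python
import math


def simulate_erg(r=2.0, x_max=1.0, a=1.0, kappa=5.0, eta=0.05,
                 dt=0.01, steps=3000):
    """Toy explicit reference governor with a sampled reference update.

    r is the desired reference (deliberately inadmissible here),
    v is the applied reference, x is the plant state.
    """
    x, v = 0.0, 0.0
    xs, vs = [], []
    for _ in range(steps):
        # Dynamic safety margin: Lyapunov threshold minus current Lyapunov
        # value, clipped at zero so the reference never moves when unsafe.
        margin = kappa * max((x_max - v) ** 2 - (x - v) ** 2, 0.0)
        # Navigation field: bounded attraction of v toward r.
        rho = (r - v) / max(abs(r - v), eta)
        # Forward-Euler update of the applied reference (assumed scheme).
        v += dt * margin * rho
        # Exact one-step solution of x' = -a (x - v) with v held constant.
        x = v + (x - v) * math.exp(-a * dt)
        xs.append(x)
        vs.append(v)
    return xs, vs


xs, vs = simulate_erg()
print("largest state reached:", max(xs))
print("final applied reference:", vs[-1])
```

With these values the applied reference creeps toward the constraint boundary and stalls there instead of following r past it. For this toy setup the sampled update stays safe because dt * kappa * (x_max - v) never exceeds 1, which is the kind of step-size restriction a discrete-time analysis has to make precise.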

[13]  arXiv:2405.20593 [pdf, other]
Title: Excitable crawling
Comments: 5 pages, MTNS 2024 extended abstract
Subjects: Systems and Control (eess.SY); Robotics (cs.RO)

We propose and analyze the suitability of a spiking controller to engineer the locomotion of a soft robotic crawler. Inspired by the FitzHugh-Nagumo model of neural excitability, we design a bistable controller with an electrical flip-flop circuit representation capable of generating spikes on-demand when coupled to the passive crawler mechanics. A proprioceptive sensory signal from the crawler mechanics turns the bistability of the controller into rhythmic spiking. The output voltage, in turn, activates the crawler's actuators to generate movement through peristaltic waves. We show through geometric analysis that this control strategy achieves endogenous crawling. The electro-mechanical sensorimotor interconnection provides embodied negative feedback regulation, facilitating locomotion. Dimensional analysis provides insights on the characteristic scales in the crawler's mechanical and electrical dynamics, and how they determine the crawling gait. Adaptive control of the electrical scales to optimally match the mechanical scales can be envisioned to achieve further efficiency, as in homeostatic regulation of neuronal circuits. Our approach can scale up to multiple sensorimotor loops inspired by biological central pattern generators.
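
For readers unfamiliar with the FitzHugh-Nagumo model that inspires this controller, the sketch below integrates the classical two-variable model with forward Euler. It is not the paper's bistable controller or crawler mechanics. The parameter values (a = 0.7, b = 0.8, eps = 0.08) are the textbook ones, and the constant input `current` is an assumed stand-in for whatever drive the feedback loop would supply.

```python
def simulate_fhn(current, a=0.7, b=0.8, eps=0.08, dt=0.01, t_end=200.0):
    """Forward-Euler integration of the classical FitzHugh-Nagumo model.

    v is the fast voltage-like variable, w the slow recovery variable.
    Returns the sampled trace of v.
    """
    v, w = -1.0, -0.5
    trace = []
    for _ in range(int(t_end / dt)):
        dv = v - v ** 3 / 3.0 - w + current
        dw = eps * (v + a - b * w)
        v += dt * dv
        w += dt * dw
        trace.append(v)
    return trace


def count_spikes(trace, threshold=1.0):
    """Count upward crossings of the threshold."""
    return sum(1 for prev, cur in zip(trace, trace[1:])
               if prev < threshold <= cur)


quiet = count_spikes(simulate_fhn(current=0.0))
spiking = count_spikes(simulate_fhn(current=0.5))
print("spikes without input:", quiet)
print("spikes with constant input:", spiking)
```

Without input the state settles at a stable rest point, while a sustained input destabilizes that point and produces repeated spikes. This is the excitable behavior that a sensory feedback signal can switch on and off.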

[14]  arXiv:2405.20595 [pdf, other]
Title: Multi-Beam Integrated Sensing and Communication: State-of-the-Art, Challenges and Opportunities
Subjects: Signal Processing (eess.SP)

Integrated sensing and communication (ISAC) has been envisioned as a critical enabling technology for the next-generation wireless communication, which can realize location/motion detection of surroundings with communication devices. This additional sensing capability leads to a substantial network quality gain and expansion of the service scenarios. As the system evolves to millimeter wave (mmWave) and above, ISAC can realize simultaneous communications and sensing of the ultra-high throughput level and radar resolution with compact design, which relies on directional beamforming against the path loss. With the multi-beam technology, the dual functions of ISAC can be seamlessly incorporated at the beamspace level by unleashing the potential of joint beamforming. To this end, this article investigates the key technologies for multi-beam ISAC system. We begin with an overview of the current state-of-the-art solutions in multi-beam ISAC. Subsequently, a detailed analysis of the advantages associated with the multi-beam ISAC is provided. Additionally, the key technologies for transmitter, channel and receiver of the multi-beam ISAC are introduced. Finally, we explore the challenges and opportunities presented by multi-beam ISAC, offering valuable insights into this emerging field.

[15]  arXiv:2405.20617 [pdf, other]
Title: Large-scale Outdoor Cell-free mMIMO Channel Measurement in an Urban Scenario at 3.5 GHz
Comments: 6 pages, 6 figures, conference: VTC 2024-Fall
Subjects: Signal Processing (eess.SP)

The design of cell-free massive MIMO (CF-mMIMO) systems requires accurate, measurement-based channel models. This paper provides the first results from by far the most extensive outdoor measurement campaign for CF-mMIMO channels in an urban environment. We measured impulse responses between over 20,000 potential access point (AP) locations and 80 user equipments (UEs) at 3.5 GHz with 350 MHz bandwidth (BW). Measurements use a "virtual array" approach at the AP and a hybrid switched/virtual approach at the UE. This paper describes the sounder design, measurement environment, data processing, and sample results, particularly the evolution of the power-delay profiles (PDPs) as a function of the AP locations, and its relation to the propagation environment.

[16]  arXiv:2405.20682 [pdf, other]
Title: Impact of Phase Selection on Accuracy and Scalability in Calculating Distributed Energy Resources Hosting Capacity
Subjects: Systems and Control (eess.SY); Optimization and Control (math.OC)

Hosting capacity (HC) and dynamic operating envelopes (DOEs), defined as dynamic, time-varying HC, are calculated using three-phase optimal power flow (OPF) formulations. Due to the computational complexity of such optimisation problems, HC and DOE are often calculated by introducing certain assumptions and approximations, including the linearised OPF formulation, which we implement in the Python-based tool ppOPF. Furthermore, we investigate how assumptions of the distributed energy resource (DER) connection phase impact the objective function value and computational time in calculating HC and DOE in distribution networks of different sizes. The results are ambiguous and show that it is not possible to determine the optimal connection phase without introducing binary variables since, no matter the case study, the highest objective function values are calculated with mixed integer OPF formulations. The difference is especially visible in a real-world low-voltage network in which the difference between different scenarios is up to 14 MW in a single day. However, binary variables make the problem computationally complex and increase computational time to several hours in the DOE calculation, even when a nonzero optimality gap is set.

[17]  arXiv:2405.20693 [pdf, other]
Title: R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

3D Gaussian splatting (3DGS) has shown promising results in image rendering and surface reconstruction. However, its potential in volumetric reconstruction tasks, such as X-ray computed tomography, remains under-explored. This paper introduces R$^2$-Gaussian, the first 3DGS-based framework for sparse-view tomographic reconstruction. By carefully deriving X-ray rasterization functions, we discover a previously unknown integration bias in the standard 3DGS formulation, which hampers accurate volume retrieval. To address this issue, we propose a novel rectification technique via refactoring the projection from 3D to 2D Gaussians. Our new method presents three key innovations: (1) introducing tailored Gaussian kernels, (2) extending rasterization to X-ray imaging, and (3) developing a CUDA-based differentiable voxelizer. Extensive experiments demonstrate that our method outperforms state-of-the-art approaches by 0.93 dB in PSNR and 0.014 in SSIM. Crucially, it delivers high-quality results in 3 minutes, which is 12x faster than NeRF-based methods and on par with traditional algorithms. The superior performance and rapid convergence of our method highlight its practical value.

[18]  arXiv:2405.20704 [pdf, other]
Title: A flexible numerical tool for large dynamic DC networks
Comments: 17 pages, 5 figures, 3 tables. First version, all comments are welcome
Subjects: Systems and Control (eess.SY)

DC networks play an important role within the ongoing energy transition. In this context, simulations of designed and existing networks and their corresponding assets are a core tool to gain insights and support decision-making. These simulations of DC networks are executed in the time domain. Due to the high frequencies involved and the controllers used, the equations that model these DC networks are stiff and highly oscillatory differential equations. By exploiting sparsity, we show that conventional adaptive time stepping schemes can be used efficiently for the time domain simulation of very large DC networks and that the computational cost scales linearly as the size of the networks increases.

[19]  arXiv:2405.20706 [pdf, other]
Title: IoT on the Road to Sustainability: Vehicle or Bandit?
Subjects: Signal Processing (eess.SP)

The Internet of Things (IoT) can support the evolution towards a digital and green future. However, the introduction of the technology clearly has in itself a direct adverse ecological impact. This paper assesses this impact at both the IoT-node and the network side. For the nodes, we show that the electronics production of devices comes with a carbon footprint that can be much higher than that of the operation phase. We highlight that the inclusion of IoT support in existing cellular networks comes with a significant ecological penalty, raising overall energy consumption by more than 15%. These results call for novel design approaches for the nodes and for early consideration of the support for IoT in future networks. Raising the 'Vehicle or bandit?' question on the nature of IoT in the broader sense of sustainability, we illustrate the need for multidisciplinary cooperation to steer applications in desirable directions.

[20]  arXiv:2405.20733 [pdf, other]
Title: Dynamic Microgrid Formation Considering Time-dependent Contingency: A Distributionally Robust Approach
Comments: 5 pages, 5 figures, Accepted by PES General Meeting 2024
Subjects: Systems and Control (eess.SY)

The increasing frequency of extreme weather events has posed significant risks to the operation of power grids. During long-duration extreme weather events, microgrid formation (MF) is an essential solution to enhance the resilience of the distribution systems by proactively partitioning the distribution system into several microgrids to mitigate the impact of contingencies. This paper proposes a distributionally robust dynamic microgrid formation (DR-DMF) approach to fully consider the temporal characteristics of line failure probability during long-duration extreme weather events like typhoons. The boundaries of each microgrid are dynamically adjusted to enhance the resilience of the system. Furthermore, the expected load shedding is minimized by a distributionally robust optimization model considering the uncertainty of line failure probability regarding the worst-case distribution of contingencies. The effectiveness of the proposed model is verified by numerical simulations on a modified IEEE 37-node system.

[21]  arXiv:2405.20746 [pdf, ps, other]
Title: UAV-Enabled Wireless Networks with Movable-Antenna Array: Flexible Beamforming and Trajectory Design
Subjects: Signal Processing (eess.SP)

Recently, the movable antenna (MA) array has become a promising technology for improving the communication quality in wireless communication systems. In this letter, an unmanned aerial vehicle (UAV) enabled multi-user multi-input-single-output system enhanced by the MA array is investigated. To enhance the throughput capacity, we aim to maximize the achievable data rate by jointly optimizing the transmit beamforming, the UAV trajectory, and the positions of the MA array antennas. The formulated data rate maximization problem is a highly coupled non-convex problem, for which an alternating optimization based algorithm is proposed to get a sub-optimal solution. Numerical results demonstrate the performance gain of the proposed method compared with the conventional method with a fixed-position antenna array.

[22]  arXiv:2405.20983 [pdf, other]
Title: Goal-Oriented Sensor Reporting Scheduling for Non-linear Dynamic System Monitoring
Subjects: Systems and Control (eess.SY)

Goal-oriented communication (GoC) is a form of semantic communication where the effectiveness of information transmission is measured by its impact on achieving the desired goal. In the context of the Internet of Things (IoT), GoC can make IoT sensors selectively transmit data pertinent to the intended goals of the receiver. Therefore, GoC holds significant value for IoT networks as it facilitates timely decision-making at the receiver, reduces network congestion, and enhances spectral efficiency. In this paper, we consider a scenario where an edge node polls sensors monitoring the state of a non-linear dynamic system (NLDS) to respond to the queries of several clients. Our work delves into the foregoing GoC problem, which we term goal-oriented scheduling (GoS). Our proposed GoS utilizes deep reinforcement learning (DRL) with meticulously devised action space, state space, and reward function. The devised action space and reward function play a pivotal role in reducing the number of sensor transmissions. Meanwhile, the devised state space empowers our DRL scheduler to poll the sensor whose observation is expected to minimize the mean square error (MSE) of the query responses. Our numerical analysis demonstrates that the proposed GoS can either reduce the query response MSE further or obtain a similar MSE compared to benchmark scheduling methods, depending on the type of query. Furthermore, the proposed GoS proves to be energy-efficient for the sensors and of lower complexity compared to benchmark scheduling methods.

[23]  arXiv:2405.21069 [pdf, other]
Title: Very Low Complexity Speech Synthesis Using Framewise Autoregressive GAN (FARGAN) with Pitch Prediction
Comments: 5 pages
Subjects: Audio and Speech Processing (eess.AS)

Neural vocoders are now being used in a wide range of speech processing applications. In many of those applications, the vocoder can be the most complex component, so finding lower complexity algorithms can lead to significant practical benefits. In this work, we propose FARGAN, an autoregressive vocoder that takes advantage of long-term pitch prediction to synthesize high-quality speech in small subframes, without the need for teacher-forcing. Experimental results show that the proposed 600 MFLOPS FARGAN vocoder can achieve both higher quality and lower complexity than existing low-complexity vocoders. The quality even matches that of existing higher-complexity vocoders.

Cross-lists for Mon, 3 Jun 24

[24]  arXiv:2405.20410 (cross-list from cs.CL) [pdf, other]
Title: SeamlessExpressiveLM: Speech Language Model for Expressive Speech-to-Speech Translation with Chain-of-Thought
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS)

Expressive speech-to-speech translation (S2ST) is a key research topic in seamless communication, which focuses on the preservation of semantics and speaker vocal style in translated speech. Early works synthesized speaker-style-aligned speech in order to directly learn the mapping from speech to target speech spectrogram. Without reliance on style-aligned data, recent studies leverage the advances of language modeling (LM) and build cascaded LMs on semantic and acoustic tokens. This work proposes SeamlessExpressiveLM, a single speech language model for expressive S2ST. We decompose the complex source-to-target speech mapping into intermediate generation steps with chain-of-thought prompting. The model is first guided to translate target semantic content and then transfer the speaker style to multi-stream acoustic units. Evaluated on Spanish-to-English and Hungarian-to-English translations, SeamlessExpressiveLM outperforms cascaded LMs in both semantic quality and style transfer, while achieving better parameter efficiency.

[25]  arXiv:2405.20426 (cross-list from cs.GT) [pdf, ps, other]
Title: Quality of Non-Convergent Best Response Processes in Multi-Agent Systems through Sink Equilibrium
Subjects: Computer Science and Game Theory (cs.GT); Systems and Control (eess.SY)

Examining the behavior of multi-agent systems is vitally important to many emerging distributed applications, and game theory has emerged as a powerful tool set with which to do so. The main approach of game-theoretic techniques is to model agents as players in a game and predict the emergent behavior through the relevant Nash equilibrium. The virtue of this viewpoint is that, by assuming that self-interested decision-making processes lead to Nash equilibrium, system behavior can then be captured by Nash equilibrium without studying the decision-making processes explicitly. This approach has seen success in a wide variety of domains, such as sensor coverage, traffic networks, auctions, and network coordination. However, in many other problem settings, Nash equilibria are not necessarily guaranteed to exist or emerge from self-interested processes. Thus the main focus of the paper is on the study of sink equilibria, which are defined as the attractors of these decision-making processes. By classifying system outcomes through a global objective function, we can analyze the resulting approximation guarantees that sink equilibria have for a given game. Our main result is an approximation guarantee on the sink equilibria through defining an introduced metric of misalignment, which captures how uniform agents are in their self-interested decision making. Overall, sink equilibria are naturally occurring in many multi-agent contexts, and we display our results on their quality with respect to two practical problem settings.
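
The notion of a sink equilibrium as an attractor of best-response processes can be illustrated on two tiny games. The sketch below is not taken from the paper. It assumes the common graph-based reading in which states are joint action profiles, an edge means one player switches to a strictly better best response, and sink equilibria are the strongly connected components with no outgoing edges. The function names and both example games are hypothetical.

```python
from itertools import product


def best_response_graph(payoffs, n_actions):
    """Map each joint profile to the profiles reachable by one best-response move.

    payoffs[i](profile) returns the utility of player i.
    """
    profiles = list(product(*[range(k) for k in n_actions]))
    graph = {p: set() for p in profiles}
    for p in profiles:
        for i, k in enumerate(n_actions):
            options = [p[:i] + (act,) + p[i + 1:] for act in range(k)]
            best = max(payoffs[i](q) for q in options)
            if payoffs[i](p) < best:  # player i can strictly improve
                graph[p].update(q for q in options if payoffs[i](q) == best)
    return graph


def reachable(graph, start):
    """All profiles reachable from start, including start itself."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def sink_equilibria(graph):
    """Sink strongly connected components of the best-response graph."""
    reach = {p: reachable(graph, p) for p in graph}
    sinks = set()
    for p in graph:
        # p is in a sink component iff everything it reaches can reach it back.
        if all(p in reach[q] for q in reach[p]):
            sinks.add(frozenset(reach[p]))
    return sinks


# Matching pennies: player 0 wants to match, player 1 wants to mismatch.
pennies = [lambda p: 1 if p[0] == p[1] else -1,
           lambda p: -1 if p[0] == p[1] else 1]
# Coordination game: both players are rewarded for matching.
coordination = [lambda p: 1 if p[0] == p[1] else 0] * 2

print(sink_equilibria(best_response_graph(pennies, [2, 2])))
print(sink_equilibria(best_response_graph(coordination, [2, 2])))
```

Matching pennies has no pure Nash equilibrium, so best responses cycle through all four profiles and that cycle is the single sink equilibrium. In the coordination game each pure Nash equilibrium is a sink on its own.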

[26]  arXiv:2405.20559 (cross-list from physics.optics) [pdf, other]
Title: Universal evaluation and design of imaging systems using information estimation
Subjects: Optics (physics.optics); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Image and Video Processing (eess.IV); Data Analysis, Statistics and Probability (physics.data-an)

Information theory, which describes the transmission of signals in the presence of noise, has enabled the development of reliable communication systems that underlie the modern world. Imaging systems can also be viewed as a form of communication, in which information about the object is "transmitted" through images. However, the application of information theory to imaging systems has been limited by the challenges of accounting for their physical constraints. Here, we introduce a framework that addresses these limitations by modeling the probabilistic relationship between objects and their measurements. Using this framework, we develop a method to estimate information using only a dataset of noisy measurements, without making any assumptions about the image formation process. We demonstrate that these estimates comprehensively quantify measurement quality across a diverse range of imaging systems and applications. Furthermore, we introduce Information-Driven Encoder Analysis Learning (IDEAL), a technique to optimize the design of imaging hardware for maximum information capture. This work provides new insights into the fundamental performance limits of imaging systems and offers powerful new tools for their analysis and design.

[27]  arXiv:2405.20587 (cross-list from cs.NI) [pdf, ps, other]
Title: Quality-Aware Task Offloading for Cooperative Perception in Vehicular Edge Computing
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)

Task offloading in Vehicular Edge Computing (VEC) can advance cooperative perception (CP) to improve traffic awareness in Autonomous Vehicles. In this paper, we propose the Quality-aware Cooperative Perception Task Offloading (Q-CPTO) scheme. Q-CPTO is the first task offloading scheme that enhances traffic awareness by prioritizing the quality rather than the quantity of cooperative perception. Q-CPTO improves the quality of CP by curtailing perception redundancy and increasing the Value of Information (VOI) procured by each user. We use Kalman filters (KFs) for VOI assessment, predicting the next movement of each vehicle to estimate its region of interest. The estimated VOI is then integrated into the task offloading problem. We formulate the task offloading problem as an Integer Linear Program (ILP) that maximizes the VOI of users and reduces perception redundancy by leveraging the spatially diverse fields of view (FOVs) of vehicles, while adhering to strict latency requirements. We also propose the Q-CPTO-Heuristic (Q-CPTO-H) scheme to solve the task offloading problem in a time-efficient manner. Extensive evaluations show that Q-CPTO significantly outperforms prominent task offloading schemes by up to 14% and 20% in terms of response delay and traffic awareness, respectively. Furthermore, Q-CPTO-H closely approaches the optimal solution, with marginal gaps of up to 1.4% and 2.1% in terms of traffic awareness and the number of collaborating users, respectively, while reducing the runtime by up to 84%.
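
As a rough illustration of the Kalman-filter prediction mentioned in this abstract, the sketch below runs a single predict step for one coordinate of a vehicle. The constant-velocity motion model, the white-noise-acceleration process noise, and the names `predict_constant_velocity` and `accel_var` are assumptions made for illustration; the abstract does not state which state or noise model Q-CPTO uses.

```python
from typing import List, Tuple

Matrix2 = List[List[float]]


def predict_constant_velocity(
    state: Tuple[float, float],
    cov: Matrix2,
    dt: float,
    accel_var: float = 0.0,
) -> Tuple[Tuple[float, float], Matrix2]:
    """One Kalman predict step for a 1-D constant-velocity model (assumed).

    state is (position, velocity); cov is its 2x2 covariance.
    Returns the predicted state and covariance after dt seconds.
    """
    pos, vel = state
    # State transition F = [[1, dt], [0, 1]]: position advances by vel * dt.
    pred_state = (pos + vel * dt, vel)

    (a, b), (c, d) = cov
    # F @ cov @ F^T written out explicitly for the 2x2 case.
    p00 = a + dt * (b + c) + dt * dt * d
    p01 = b + dt * d
    p10 = c + dt * d
    p11 = d

    # Process noise for white-noise acceleration with variance accel_var
    # (a common textbook choice, assumed here).
    q00 = accel_var * dt ** 4 / 4.0
    q01 = accel_var * dt ** 3 / 2.0
    q11 = accel_var * dt ** 2
    pred_cov = [[p00 + q00, p01 + q01], [p10 + q01, p11 + q11]]
    return pred_state, pred_cov


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    new_state, new_cov = predict_constant_velocity((10.0, 2.0), identity, dt=0.5)
    print(new_state, new_cov)
```

The position variance grows with the prediction horizon, so a longer look-ahead gives a wider uncertainty band around the predicted position. How that prediction is turned into a region of interest or a VOI score is specific to the paper and is not shown here.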

[28]  arXiv:2405.20723 (cross-list from physics.optics) [pdf, other]
Title: Beaconless Auto-Alignment for Single-Wavelength 5 Tbit/s Mode-Division Multiplexing Free-Space Optical Communications
Subjects: Optics (physics.optics); Signal Processing (eess.SP)

Mode-division multiplexing has shown its ability to significantly increase the capacity of free-space optical communications. An accurate alignment is crucial to enable such links due to possible performance degradation induced by mode crosstalk and narrow beam divergence. Conventionally, a beacon beam is necessary for system alignment due to multiple local maxima in the mode-division multiplexed beam profile. However, the beacon beam introduces excess system complexity, power consumption, and alignment errors. Here we demonstrate a beaconless system with significantly higher alignment accuracy and faster acquisition. This system also excludes excess complexity, power consumption, and alignment errors, facilitating simplified system calibration and supporting a record-high 5.14 Tbit/s line rate in a single-wavelength free-space optical link. We anticipate our paper to be a starting point for more sophisticated alignment scenarios in future multi-Terabit mode-division multiplexing free-space optical communications for long-distance applications with a generalised mode basis.

[29]  arXiv:2405.20877 (cross-list from cs.IT) [pdf, other]
Title: Waveform Design for Over-the-Air Computing
Comments: 14 pages
Subjects: Information Theory (cs.IT); Distributed, Parallel, and Cluster Computing (cs.DC); Machine Learning (cs.LG); Signal Processing (eess.SP); Statistics Theory (math.ST)

In response to the increasing number of devices anticipated in next-generation networks, a shift toward over-the-air (OTA) computing has been proposed. Leveraging the superposition of multiple access channels, OTA computing enables efficient resource management by supporting simultaneous uncoded transmission in the time and the frequency domain. Thus, to advance the integration of OTA computing, our study presents a theoretical analysis addressing practical issues encountered in current digital communication transceivers, such as time sampling error and intersymbol interference (ISI). To this end, we examine the theoretical mean squared error (MSE) for OTA transmission under time sampling error and ISI, while also exploring methods for minimizing the MSE in the OTA transmission. Utilizing alternating optimization, we also derive optimal power policies for both the devices and the base station. Additionally, we propose a novel deep neural network (DNN)-based approach to design waveforms enhancing OTA transmission performance under time sampling error and ISI. To ensure fair comparison with existing waveforms like the raised cosine (RC) and the better-than-raised-cosine (BTRC), we incorporate a custom loss function integrating energy and bandwidth constraints, along with practical design considerations such as waveform symmetry. Simulation results validate our theoretical analysis and demonstrate performance gains of the designed pulse over RC and BTRC waveforms. To facilitate testing of our results without necessitating the DNN structure recreation, we provide curve fitting parameters for select DNN-based waveforms as well.
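
For readers unfamiliar with the raised-cosine (RC) baseline named in this abstract, the following sketch evaluates the textbook RC pulse and checks its zero-ISI property at ideal sampling instants, which a timing offset breaks. It is not the paper's DNN-designed waveform; the function names, the symbol period `T`, and the roll-off value 0.35 are illustrative assumptions.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def raised_cosine(t: float, T: float = 1.0, beta: float = 0.35) -> float:
    """Textbook raised-cosine pulse with symbol period T and roll-off beta."""
    if T <= 0.0:
        raise ValueError("T must be positive")
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    x = 2.0 * beta * t / T
    # At |t| = T / (2 beta) the closed form is 0/0; use its analytic limit.
    if beta > 0.0 and math.isclose(abs(x), 1.0, rel_tol=0.0, abs_tol=1e-12):
        return (math.pi / (4.0 * T)) * sinc(1.0 / (2.0 * beta))
    return (
        (1.0 / T)
        * sinc(t / T)
        * math.cos(math.pi * beta * t / T)
        / (1.0 - x * x)
    )


def isi_at_offset(offset: float, n_symbols: int = 20) -> float:
    """Sum of |pulse| contributions from neighbouring symbols when the
    receiver samples `offset` seconds away from the ideal instant (T = 1)."""
    return sum(
        abs(raised_cosine(k + offset))
        for k in range(-n_symbols, n_symbols + 1)
        if k != 0
    )


if __name__ == "__main__":
    print("ISI with ideal sampling:   ", isi_at_offset(0.0))
    print("ISI with 0.1 T timing error:", isi_at_offset(0.1))
```

With ideal sampling the neighbouring-symbol terms vanish up to floating-point error, while a nonzero timing offset leaves residual interference. That residual is the kind of degradation the abstract's analysis of time sampling error and ISI is concerned with.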

[30]  arXiv:2405.20884 (cross-list from cs.SD) [pdf, other]
Title: Effects of Dataset Sampling Rate for Noise Cancellation through Deep Learning
Comments: 16 pages, 8 pictures, 3 tables
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Audio and Speech Processing (eess.AS)

Background: Active noise cancellation has been a subject of research for decades. Traditional techniques, like the Fast Fourier Transform, have limitations in certain scenarios. This research explores the use of deep neural networks (DNNs) as a superior alternative. Objective: The study aims to determine the effect that the sampling rate of the training data has on lightweight, efficient DNNs that operate within the processing constraints of mobile devices. Methods: We chose the Conv-TasNet network for its proven efficiency in speech separation and enhancement. Conv-TasNet was trained on datasets such as WHAM!, LibriMix, and the MS-2023 DNS Challenge. The datasets were sampled at rates of 8 kHz, 16 kHz, and 48 kHz to analyze the effect of sampling rate on noise cancellation efficiency and effectiveness. The model was tested on a 2023 Intel Core i7 processor, assessing the network's ability to produce clear audio while filtering out background noise. Results: Models trained at higher sampling rates (48 kHz) provided much better evaluation metrics against Total Harmonic Distortion (THD) and Quality Prediction For Generative Neural Speech Codecs (WARP-Q) values, indicating improved audio quality. However, a trade-off was noted, with the processing time being longer for higher sampling rates. Conclusions: The Conv-TasNet network, trained on datasets sampled at higher rates like 48 kHz, offers a robust solution for mobile devices in achieving noise cancellation through speech separation and enhancement. Future work involves optimizing the model's efficiency further and testing on mobile devices.

[31]  arXiv:2405.20887 (cross-list from cs.SD) [pdf, other]
Title: On the Condition Monitoring of Bolted Joints through Acoustic Emission and Deep Transfer Learning: Generalization, Ordinal Loss and Super-Convergence
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)

This paper investigates the use of deep transfer learning based on convolutional neural networks (CNNs) to monitor the condition of bolted joints using acoustic emissions. Bolted structures are critical components in many mechanical systems, and the ability to monitor their condition status is crucial for effective structural health monitoring. We evaluated the performance of our methodology using the ORION-AE benchmark, a structure composed of two thin beams connected by three bolts, where highly noisy acoustic emission measurements were taken to detect changes in the applied tightening torque of the bolts. The data used from this structure are derived by transforming acoustic emission data streams into images using the continuous wavelet transform, and pretrained CNNs are leveraged for feature extraction and denoising. Our experiments compared single-sensor versus multiple-sensor fusion for estimating the tightening level (loosening) of bolts and evaluated the use of raw versus prefiltered data on the performance. We particularly focused on the generalization capabilities of CNN-based transfer learning across different measurement campaigns, and we studied ordinal loss functions to penalize incorrect predictions less severely when close to the ground truth, thereby encouraging misclassification errors to be in adjacent classes. Network configurations as well as learning rate schedulers are also investigated, and super-convergence is obtained, i.e., high classification accuracy is achieved in a small number of iterations with different networks. Furthermore, results demonstrate the generalization capabilities of CNN-based transfer learning for monitoring bolted structures by acoustic emission with varying amounts of prior information required during training.

[32]  arXiv:2405.20951 (cross-list from cs.AI) [pdf, other]
Title: Monte Carlo Tree Search Satellite Scheduling Under Cloud Cover Uncertainty
Comments: 11 pages, 4 figures
Subjects: Artificial Intelligence (cs.AI); Systems and Control (eess.SY)

Efficient utilization of satellite resources in dynamic environments remains a challenging problem in satellite scheduling. This paper addresses the multi-satellite collection scheduling problem (m-SatCSP), aiming to optimize task scheduling over a constellation of satellites under uncertain conditions such as cloud cover. Leveraging Monte Carlo Tree Search (MCTS), a stochastic search algorithm, two versions of MCTS are explored to schedule satellites effectively. Hyperparameter tuning is conducted to optimize the algorithm's performance. Experimental results demonstrate the effectiveness of the MCTS approach, outperforming existing methods in both solution quality and efficiency. Comparative analysis against other scheduling algorithms showcases competitive performance, positioning MCTS as a promising solution for satellite task scheduling in dynamic environments.

[33]  arXiv:2405.20969 (cross-list from cs.RO) [pdf, other]
Title: Design, Calibration, and Control of Compliant Force-sensing Gripping Pads for Humanoid Robots
Comments: 21 pages, 16 figures, Published in ASME Journal of Mechanisms and Robotics
Journal-ref: Journal of Mechanisms and Robotics, 15, 031010,2023
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)

This paper introduces a pair of low-cost, light-weight and compliant force-sensing gripping pads used for manipulating box-like objects with smaller-sized humanoid robots. These pads measure normal gripping forces and center of pressure (CoP). A calibration method is developed to improve the CoP measurement accuracy. A hybrid force-alignment-position control framework is proposed to regulate the gripping forces and to ensure the surface alignment between the grippers and the object. Limit surface theory is incorporated as a contact friction modeling approach to determine the magnitude of gripping forces for slippage avoidance. The integrated hardware and software system is demonstrated with a NAO humanoid robot. Experiments show the effectiveness of the overall approach.

[34]  arXiv:2405.20972 (cross-list from cs.MA) [pdf, other]
Title: Congestion-Aware Path Re-routing Strategy for Dense Urban Airspace
Subjects: Multiagent Systems (cs.MA); Systems and Control (eess.SY)

Existing UAS Traffic Management (UTM) frameworks designate preplanned flight paths to uncrewed aircraft systems (UAS), enabling the UAS to deliver payloads. However, with increasing delivery demand between the source-destination pairs in the urban airspace, UAS will likely experience considerable congestion on the nominal paths. We propose a rule-based congestion mitigation strategy that improves UAS safety and airspace utilization in congested traffic streams. The strategy relies on nominal path information from the UTM and positional information of other UAS in the vicinity. Following the strategy, UAS opts for alternative local paths in the unoccupied airspace surrounding the nominal path and avoids congested regions. The strategy results in UAS traffic exploring and spreading to alternative adjacent routes on encountering congestion. The paper presents queuing models to estimate the expected traffic spread for varying stochastic delivery demand at the source, thus helping to reserve the airspace around the nominal path beforehand to accommodate any foreseen congestion. Simulations are presented to validate the queuing results in the presence of static obstacles and intersecting UAS streams.

[35]  arXiv:2405.20987 (cross-list from cs.CV) [pdf, other]
Title: Early Stopping Criteria for Training Generative Adversarial Networks in Biomedical Imaging
Comments: This paper is accepted at the 35th IEEE Irish Signals and Systems Conference (ISSC 2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Generative Adversarial Networks (GANs) have high computational costs to train their complex architectures. Throughout the training process, GANs' output is analyzed qualitatively based on the loss and synthetic images' diversity and quality. Based on this qualitative analysis, training is manually halted once the desired synthetic images are generated. Utilizing an early stopping criterion can reduce the computational cost and the dependence on manual oversight, yet such a criterion is itself impacted by training problems such as mode collapse, non-convergence, and instability. This is particularly prevalent in biomedical imagery, where training problems degrade the diversity and quality of synthetic images, and the high computational cost associated with training makes complex architectures increasingly inaccessible. This work proposes novel early stopping criteria to quantitatively detect training problems, halt training, and reduce the computational costs associated with synthesizing biomedical images. Firstly, the range of generator and discriminator loss values is investigated to assess whether mode collapse, non-convergence, and instability occur sequentially, concurrently, or interchangeably throughout the training of GANs. Secondly, utilizing these occurrences in conjunction with the Mean Structural Similarity Index (MS-SSIM) and Fréchet Inception Distance (FID) scores of synthetic images forms the basis of the proposed early stopping criteria. This work helps identify the occurrence of training problems in GANs at low computational cost and reduces the training time needed to generate diversified and high-quality synthetic images.

[36]  arXiv:2405.21021 (cross-list from cs.LG) [pdf, other]
Title: Beyond Conventional Parametric Modeling: Data-Driven Framework for Estimation and Prediction of Time Activity Curves in Dynamic PET Imaging
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV); Dynamical Systems (math.DS)

Dynamic Positron Emission Tomography (dPET) imaging and Time-Activity Curve (TAC) analyses are essential for understanding and quantifying the biodistribution of radiopharmaceuticals over time and space. Traditional compartmental modeling, while foundational, commonly struggles to fully capture the complexities of biological systems, including non-linear dynamics and variability. This study introduces an innovative data-driven neural network-based framework, inspired by Reaction Diffusion systems, designed to address these limitations. Our approach, which adaptively fits TACs from dPET, enables the direct calibration of diffusion coefficients and reaction terms from observed data, offering significant improvements in predictive accuracy and robustness over traditional methods, especially in complex biological scenarios. By more accurately modeling the spatio-temporal dynamics of radiopharmaceuticals, our method advances modeling of pharmacokinetic and pharmacodynamic processes, enabling new possibilities in quantitative nuclear medicine.

Replacements for Mon, 3 Jun 24

[37]  arXiv:2207.11860 (replaced) [pdf, other]
Title: Behind Every Domain There is a Shift: Adapting Distortion-aware Vision Transformers for Panoramic Semantic Segmentation
Comments: Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). Extended version of CVPR 2022 paper arXiv:2203.01452. Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); Image and Video Processing (eess.IV)
[38]  arXiv:2211.04634 (replaced) [pdf, other]
Title: Learning Optimal Graph Filters for Clustering of Attributed Graphs
Comments: 12 pages, 7 figures
Subjects: Machine Learning (cs.LG); Social and Information Networks (cs.SI); Signal Processing (eess.SP)
[39]  arXiv:2212.00394 (replaced) [pdf, other]
Title: From CNNs to Shift-Invariant Twin Models Based on Complex Wavelets
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV); Machine Learning (stat.ML)
[40]  arXiv:2304.07248 (replaced) [pdf, ps, other]
Title: The University of California San Francisco Brain Metastases Stereotactic Radiosurgery (UCSF-BMSR) MRI Dataset
Comments: 15 pages, 2 tables, 2 figures
Journal-ref: Radiology: Artificial Intelligence. 2024;6(2):e230126
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[41]  arXiv:2304.12890 (replaced) [pdf, other]
Title: MRI Recovery with Self-Calibrated Denoisers without Fully-Sampled Data
Subjects: Image and Video Processing (eess.IV)
[42]  arXiv:2305.01461 (replaced) [pdf, other]
Title: Mixed-Integer Optimal Control via Reinforcement Learning: A Case Study on Hybrid Electric Vehicle Energy Management
Subjects: Systems and Control (eess.SY); Artificial Intelligence (cs.AI)
[43]  arXiv:2305.15255 (replaced) [pdf, other]
Title: Spoken Question Answering and Speech Continuation Using Spectrogram-Powered LLM
Comments: ICLR 2024 camera-ready
Subjects: Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[44]  arXiv:2306.10843 (replaced) [pdf, other]
Title: Female mosquito detection by means of AI techniques inside release containers in the context of a Sterile Insect Technique program
Comments: Accepted EUSIPCO 2024
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)
[45]  arXiv:2307.16128 (replaced) [pdf, other]
Title: Online Interior-point Methods for Time-varying Equality-constrained Optimization
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY)
[46]  arXiv:2308.12985 (replaced) [pdf, ps, other]
Title: Perimeter Control with Heterogeneous Metering Rates for Cordon Signals: A Physics-Regularized Multi-Agent Reinforcement Learning Approach
Comments: 21 pages, 24 figures
Subjects: Artificial Intelligence (cs.AI); Systems and Control (eess.SY)
[47]  arXiv:2310.00154 (replaced) [pdf, other]
Title: Primal Dual Continual Learning: Balancing Stability and Plasticity through Adaptive Memory Allocation
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Signal Processing (eess.SP)
[48]  arXiv:2310.07557 (replaced) [pdf, other]
Title: Quality of Service-Constrained Online Routing in High Throughput Satellites
Comments: Added constraints and updated numerical results. Layout improvement
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
[49]  arXiv:2310.18953 (replaced) [pdf, other]
Title: TIC-TAC: A Framework for Improved Covariance Estimation in Deep Heteroscedastic Regression
Comments: ICML 2024. Please feel free to provide feedback!
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[50]  arXiv:2311.01479 (replaced) [pdf, other]
Title: Detecting Out-of-Distribution Through the Lens of Neural Collapse
Authors: Litian Liu, Yao Qin
Subjects: Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[51]  arXiv:2311.06983 (replaced) [pdf, other]
Title: A Different View of Sigma-Delta Modulators Under the Lens of Pulse Frequency Modulation
Authors: Victor Medina (1), Pieter Rombouts (2), Luis Hernandez (1) ((1) Carlos III University, Madrid, Spain. (2) Ghent University, Belgium.)
Comments: 15 pages, 28 figures
Subjects: Systems and Control (eess.SY)
[52]  arXiv:2311.10879 (replaced) [pdf, other]
Title: Pre- to Post-Contrast Breast MRI Synthesis for Enhanced Tumour Segmentation
Comments: Accepted as oral presentation at SPIE Medical Imaging 2024 (Image Processing)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[53]  arXiv:2311.11745 (replaced) [pdf, other]
Title: ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
Comments: ICML 2024
Subjects: Sound (cs.SD); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[54]  arXiv:2401.00766 (replaced) [pdf, other]
Title: Exposure Bracketing is All You Need for Unifying Image Restoration and Enhancement Tasks
Comments: 21 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[55]  arXiv:2401.03922 (replaced) [pdf, ps, other]
Title: SNeurodCNN: Structure-focused Neurodegeneration Convolutional Neural Network for Modelling and Classification of Alzheimer's Disease
Comments: 36 Pages, 10 figures, 4 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[56]  arXiv:2401.11053 (replaced) [pdf, other]
Title: StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
Subjects: Audio and Speech Processing (eess.AS); Sound (cs.SD)
[57]  arXiv:2402.00325 (replaced) [pdf, ps, other]
Title: Using digital twins for managing change in complex projects
Comments: 11 pages, 5 figures
Subjects: Systems and Control (eess.SY)
[58]  arXiv:2402.13901 (replaced) [pdf, other]
Title: Non-asymptotic Convergence of Discrete-time Diffusion Models: New Approach and Improved Rate
Subjects: Machine Learning (cs.LG); Signal Processing (eess.SP); Machine Learning (stat.ML)
[59]  arXiv:2402.17502 (replaced) [pdf, other]
Title: FedLPPA: Learning Personalized Prompt and Aggregation for Federated Weakly-supervised Medical Image Segmentation
Comments: 12 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[60]  arXiv:2403.04626 (replaced) [pdf, other]
Title: MedFLIP: Medical Vision-and-Language Self-supervised Fast Pre-Training with Masked Autoencoder
Subjects: Image and Video Processing (eess.IV); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[61]  arXiv:2403.05826 (replaced) [pdf, other]
Title: Cached Model-as-a-Resource: Provisioning Large Language Model Agents for Edge Intelligence in Space-air-ground Integrated Networks
Subjects: Networking and Internet Architecture (cs.NI); Signal Processing (eess.SP)
[62]  arXiv:2403.11067 (replaced) [pdf, other]
Title: Signal Fidelity in Degenerate and Nondegenerate Mode Parametric Amplifier Receiving Antennas
Comments: 5 pages, 6 figures. Submitted to IEEE Antennas and Wireless Propagation Letters March 15, 2024; revised May 30, 2024
Subjects: Systems and Control (eess.SY)
[63]  arXiv:2404.01803 (replaced) [pdf, ps, other]
Title: Systematic Solutions to Login and Authentication Security Problems: A Dual-Password Login-Authentication Mechanism
Authors: Suyun Borjigin
Comments: 11 pages, 3 figures, 28 conferences
Subjects: Cryptography and Security (cs.CR); Emerging Technologies (cs.ET); Systems and Control (eess.SY)
[64]  arXiv:2404.03197 (replaced) [pdf, other]
Title: A Rolling Horizon Restoration Framework for Post-disaster Restoration of Electrical Distribution Networks
Comments: 26 pages, 17 figures
Subjects: Systems and Control (eess.SY)
[65]  arXiv:2404.07217 (replaced) [pdf, other]
Title: Attention-aware Semantic Communications for Collaborative Inference
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[66]  arXiv:2404.07402 (replaced) [pdf, other]
Title: An excursion onto Schrödinger's bridges: Stochastic flows with spatio-temporal marginals
Comments: 6 pages, 2 figures
Subjects: Optimization and Control (math.OC); Systems and Control (eess.SY); Probability (math.PR)
[67]  arXiv:2404.07989 (replaced) [pdf, other]
Title: Any2Point: Empowering Any-modality Large Models for Efficient 3D Understanding
Comments: Code and models are released at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[68]  arXiv:2405.08295 (replaced) [pdf, other]
Title: SpeechVerse: A Large-scale Generalizable Audio Language Model
Comments: Single Column, 13 pages
Subjects: Computation and Language (cs.CL); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[69]  arXiv:2405.10780 (replaced) [pdf, ps, other]
Title: Intelligent and Miniaturized Neural Interfaces: An Emerging Era in Neurotechnology
Journal-ref: 2024 IEEE Custom Integrated Circuits Conference (CICC), Denver, CO, USA, 2024, pp. 1-7
Subjects: Signal Processing (eess.SP); Hardware Architecture (cs.AR); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)
[70]  arXiv:2405.11386 (replaced) [pdf, other]
Title: Liver Fat Quantification Network with Body Shape
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[71]  arXiv:2405.15187 (replaced) [pdf, other]
Title: Chance-Constrained Economic Dispatch with Flexible Loads and RES
Subjects: Systems and Control (eess.SY)
[72]  arXiv:2405.15259 (replaced) [pdf, other]
Title: Robust Economic Dispatch with Flexible Demand and Adjustable Uncertainty Set
Subjects: Systems and Control (eess.SY)
[73]  arXiv:2405.15923 (replaced) [pdf, ps, other]
Title: Spiketrum: An FPGA-based Implementation of a Neuromorphic Cochlea
Comments: To be published at "IEEE Transactions on Circuits and Systems"
Subjects: Signal Processing (eess.SP); Neural and Evolutionary Computing (cs.NE); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[74]  arXiv:2405.15927 (replaced) [pdf, ps, other]
Title: Application based Evaluation of an Efficient Spike-Encoder, "Spiketrum"
Comments: To be published at "IEEE/ACM Transactions on Audio, Speech, and Language Processing"
Subjects: Signal Processing (eess.SP); Neural and Evolutionary Computing (cs.NE); Systems and Control (eess.SY)
[75]  arXiv:2405.16649 (replaced) [pdf, other]
Title: Deep Koopman Learning using the Noisy Data
Subjects: Systems and Control (eess.SY)
[76]  arXiv:2405.18558 (replaced) [pdf, other]
Title: "Golden Ratio Yoshimura" for Meta-Stable and Massively Reconfigurable Deployment
Subjects: Robotics (cs.RO); Systems and Control (eess.SY)
[77]  arXiv:2405.18669 (replaced) [pdf, other]
Title: Zipper: A Multi-Tower Decoder Architecture for Fusing Modalities
Comments: Under review at NeurIPS
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Audio and Speech Processing (eess.AS)
[78]  arXiv:2405.19542 (replaced) [pdf, other]
Title: Anatomical Region Recognition and Real-time Bone Tracking Methods by Dynamically Decoding A-Mode Ultrasound Signals
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Robotics (cs.RO)
[79]  arXiv:2405.20052 (replaced) [pdf, other]
Title: Hardware-Efficient EMG Decoding for Next-Generation Hand Prostheses
Comments: © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG)
[80]  arXiv:2405.20172 (replaced) [pdf, other]
Title: Iterative Feature Boosting for Explainable Speech Emotion Recognition
Comments: Published in: 2023 International Conference on Machine Learning and Applications (ICMLA)
Journal-ref: 2023 International Conference on Machine Learning and Applications (ICMLA), Jacksonville, FL, USA, 2023, pp. 543-549
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Audio and Speech Processing (eess.AS)