literature_review.txt
2305.12025.pdf: The paper discusses the use of a memcapacitive physical reservoir computing system for temporal data processing. The system is energy-efficient and achieves a 98% accuracy rate for spoken digit classification and a normalized mean square error of 0.0012 in a second-order non-linear regression task. The system is also demonstrated to be accurate for an electroencephalography (EEG) signal classification problem for epilepsy detection. This paper introduces a memcapacitor-based reservoir computing system that leverages two-terminal, scalable, biomolecular memcapacitors. The biomolecular memcapacitor consists of a synthetic lipid bilayer formed between two lipid-encased aqueous droplets submerged in an oil phase. At the interface of both droplets, an elliptical, planar lipid bilayer (~100 µm in radius) spontaneously forms with a highly insulating (>100 MΩ·cm²) core consisting of a mixture of hydrophobic lipid tails and residual entrapped oil. Upon transmembrane voltage application, the ionically charged lipid bilayer manifests geometrical changes due to electrowetting (EW) and electrocompression (EC) (Figure 1A), leading to an increase in bilayer area and a decrease in the hydrophobic thickness, respectively (see Supplementary Section 1 and Figures S1-S4 for more details). The bilayer exhibits dynamic, voltage-controlled capacitance with paired-pulse facilitation (PPF) (Figure 1B) via geometric reconfigurability of its interfacial area and hydrophobic thickness [54], enabling high-dimensional temporal transformation with minimal power and energy consumption. First, the authors demonstrate the device's computational quality by conducting spoken-digit recognition as well as predicting a second-order dynamical time series, and compare the achieved accuracies one-to-one with another memristor-based RC report [32, 33]. Then, taking advantage of the device's biological synapse-like timescale (~10² ms [55]), they solve an electroencephalography (EEG) signal classification problem to demonstrate the device's real-time temporal processing. Furthermore, for completeness, they solve an Iris dataset classification problem to confirm that the device's short-term dynamics can solve a static classification problem (Supplementary Information Section 3). Finally, the authors present the device's power and energy-per-spike consumption and compare them with the consumption of other state-of-the-art memristors [32, 34, 37, 56]. The paper discusses the use of a memcapacitor-based reservoir for solving three distinct problems, where the input encoding and the feature-space definition for each problem were executed differently. The first problem is a benchmark spoken-digit classification problem, where the input signal is binary (either a 0 or a 1) and temporally history-dependent. For this problem, a virtual node was selected for every five equally spaced inputs. The second problem is a regression problem, where the input signal is random, continuous (non-binary), and history-dependent. Unlike the spoken-digit problem, the input signal was encoded at 10 different timescales to effectively increase the dimensionality of the reservoir. The third problem is an EEG signal classification problem, treated as a real-time temporal signal processing problem. Leveraging the device's biological fading memory (~100 ms), a feature-modification post-process that integrates 60 short-timescale (5.8 ms) features into one long-timescale (348 ms) cumulative feature (i.e., virtual node) is used. 
As a supplementary problem, the paper solves a static classification problem, namely the Iris dataset problem, where the input is history-independent. The paper discusses the use of the memcapacitive RC system to solve a second-order nonlinear dynamic task. The system is trained to map a random input onto a higher-dimensional space, thereby enabling the generation of an accurate second-order dynamic nonlinear transfer-function output from the input after training, without prior knowledge of the underlying mathematical relationship between input and output. The system is composed of 50 memcapacitive virtual nodes realized with 5 physical memcapacitive devices. For each memcapacitor, the input voltages are provided via pulse streams with a frequency of 10 Hz. The system is trained using a random input signal sequence within the range of 0 to 0.5, transformed into voltage amplitudes between 50 mV and 200 mV. The system is then tested with a separate random input signal sequence and is able to generate an accurate second-order dynamic nonlinear transfer-function output (a minimal readout-training sketch for this task follows below). This paper discusses a memcapacitor-based RC system for solving second-order dynamic tasks. The system is trained using a linear regression algorithm, and the results show that it is effective for both training and testing data. Additionally, the system is able to solve classification problems in real time. The paper presents a memcapacitive RC system that is capable of classifying data accurately, even in the absence of complete input data. The system is also able to handle static as well as real-time classification problems. The system is compared with a conventional linear network and is shown to be superior in terms of prediction NMSE. The paper discusses a new method for reservoir computing that uses memcapacitive devices. The method is demonstrated to be orders of magnitude more energy efficient than state-of-the-art memristors, and its performance is also shown to be independent of the pulse width of the input voltage. The method has potential applications in a wide range of areas, including action recognition, prediction, and classification. This paper reviews recent advances in the field of deep learning, with a focus on recurrent neural networks (RNNs). It discusses the various types of RNNs and their applications in speech recognition and stock price prediction. The paper also reviews the challenges associated with training RNNs and highlights some promising future directions for research. This paper discusses the use of spintronic devices for physical reservoir computing, which is a type of artificial intelligence that can be used for temporal information processing. The paper describes the use of spintronic devices for reservoir computing and how they can be used to improve the performance of reservoir computing systems. The paper also describes the use of spintronic devices for other applications, such as spoken digit classification and forecasting. This paper reviews the use of reservoir computing and extreme learning machines for non-linear time-series data analysis. It discusses the advantages of using these methods for data analysis, including their ability to handle non-linear data and their computational efficiency. The paper also reviews some of the challenges associated with using these methods, including the need for careful tuning of the reservoir parameters and the difficulty of training the machines to handle complex data. 
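The following is an illustrative sketch, not the authors' code, of how a linear readout over virtual-node features can be trained for the second-order nonlinear dynamic task summarized above. The target equation below is the form commonly used for this benchmark in the memristor RC literature the paper compares against, and the reservoir response is replaced by a placeholder fading-memory filter bank standing in for the measured capacitances; both are assumptions for illustration only.

    # Sketch: linear readout training for the second-order nonlinear dynamic task.
    import numpy as np

    rng = np.random.default_rng(0)

    def second_order_target(u):
        # Assumed benchmark form: y(k) = 0.4 y(k-1) + 0.4 y(k-1) y(k-2) + 0.6 u(k)^3 + 0.1
        y = np.zeros(len(u))
        for k in range(len(u)):
            y[k] = 0.4 * y[k - 1] + 0.4 * y[k - 1] * y[k - 2] + 0.6 * u[k] ** 3 + 0.1
        return y

    def reservoir_features(u, n_nodes=50):
        # Placeholder for the 50 virtual-node readings per time step
        # (5 physical devices x 10 input timescales in the paper); a random
        # fading-memory filter bank merely mimics history-dependent responses.
        taus = np.linspace(0.1, 0.9, n_nodes)
        x = np.zeros((len(u), n_nodes))
        for k in range(1, len(u)):
            x[k] = taus * x[k - 1] + (1 - taus) * np.tanh(u[k])
        return x

    u_train = rng.uniform(0.0, 0.5, 1000)           # random input in [0, 0.5]
    X = reservoir_features(u_train)                 # N x 50 virtual-node feature matrix
    X = np.hstack([X, np.ones((len(u_train), 1))])  # bias column
    y = second_order_target(u_train)

    W = np.linalg.lstsq(X, y, rcond=None)[0]        # linear (least-squares) readout
    y_hat = X @ W
    nmse = np.mean((y - y_hat) ** 2) / np.var(y)
    print(f"training NMSE: {nmse:.4f}")

In the actual system, the feature matrix would be built from the measured virtual-node capacitances rather than the surrogate filter bank used here; the readout step is the same linear regression the paper describes.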
The paper reports on the development of a memcapacitor, a two-terminal nonlinear energy storage element that exhibits memory properties. The memcapacitor is made of a lipid bilayer and can be categorized as either nonvolatile or volatile, depending on whether the memcapacitor's states are maintained or not upon removal of an electrical stimulus. The device developed by Najem et al. is volatile, meaning that its states are not maintained upon removal of an electrical stimulus. The device exhibits short-term memory, whereby the magnitude of capacitance nonlinearly depends on one or more internal states and can be regulated based on present and past external stimulation. The paper discusses a memcapacitor-based system for classifying spoken digits. The system is designed to be used in conjunction with an EEG system for classifying healthy and epileptic brain signals. The memcapacitor system is able to achieve high accuracy rates in both simulation and experimental settings. The paper discusses the use of a memcapacitive reservoir system for classification of the Iris dataset. The system is trained using logistic regression. The results of the experimentation showed a recognition rate of approximately 98.33%. This highlights the potential of memcapacitive reservoir systems in handling complex datasets and further confirms their capability as a promising alternative to traditional machine learning algorithms. This paper presents a memristive ion channel-doped biomembrane as a synaptic mimic. The memristive ion channel is used to modulate the synaptic strength, and the biomembrane is used to provide a biologically realistic environment. The paper discusses the use of the memristive ion channel-doped biomembrane for temporal information processing and data classification.
2305.15395.pdf: The paper discusses the importance of prediction accuracy for voltage regulation in active distribution networks. Current prediction models aim at minimizing individual prediction errors but overlook their collective impact on downstream decision-making. The paper proposes a safety-aware semi-end-to-end coordinated decision model to bridge the gap from the downstream voltage regulation to the upstream multiple prediction models in a coordinated differential way. The semi-end-to-end model maps the input features to the optimal var decisions via prediction, decision-making, and decision-evaluating layers. It leverages the neural network and the second-order cone program (SOCP) to formulate the stochastic PV/load predictions and the var decision-making/evaluating separately. Then the var decision quality is evaluated via the weighted sum of the power loss for economy and the voltage violation penalty for safety, denoted as the regulation loss. Based on the regulation loss and prediction errors, the paper proposes the hybrid loss and the hybrid stochastic gradient descent algorithm to back-propagate the gradients of the hybrid loss with respect to multiple predictions for enhancing decision quality. Case studies verify the effectiveness of the proposed model with lower power loss for economy and a lower voltage violation rate for safety awareness. The paper discusses the gap between downstream decision-making and upstream predicting in voltage regulation under PV penetration. It proposes the safety-aware semi-end-to-end coordinated decision model for voltage regulation, which considers the decision quality by integrating the downstream regulation model in training the upstream multiple PV/load prediction models in a coordinated differential way. The semi-end-to-end decision model leverages neural network (NN) models (intractable) to formulate the stochastic PV/load power and second-order cone program (SOCP) models (tractable) to formulate the optimal var decision-making. Its main framework includes three layers, mapping from the input features to multiple PV/load predictions by the NN-driven prediction layer, from predictions to var decisions by the SOCP-driven decision-making layer, and from the var decisions to the decision-quality evaluation by the SOCP-driven decision-evaluating layer. It evaluates the var decision quality under predictions by the weighted sum of the power loss for economy and the voltage violation penalty for safety awareness, denoted as the regulation loss. The gradients of the regulation loss with respect to the predicted PV/load are derived from the SOCP-driven decision-making and evaluation through the SOCP derivative solution map, including skew-symmetric mapping, homogeneous self-dual embedding, and solution construction. Combined with prediction errors, the authors propose the safety-aware semi-end-to-end hybrid stochastic gradient descent (HSGD) learning algorithm to back-propagate the hybrid gradients to multiple prediction models for reducing regulation loss and enhancing decision quality. The principal contributions of this paper can be summarized as threefold: 1) To the authors' best knowledge, this paper, for the first time, takes advantage of the downstream voltage regulation model to train the multiple prediction models in a coordinated differential way for enhancing decision quality. 
Compared to conventional prediction approaches [3]-[8], the proposed semi-end-to-end model utilizes the reverse impact of downstream decision-making on the multiple upstream predictions to improve the decision quality under the multiple coordinated predictions. 2) The regulation loss is proposed to evaluate the decision quality of voltage regulation, including the economic part for minimizing power loss and the safety-aware part for guaranteeing safe operation. The gradients of the regulation loss with respect to the multiple predictions are derived by the SOCP derivative solution map and back-propagated to the multiple prediction models for enhancing the decision quality. The paper proposes a semi-end-to-end learning framework for voltage regulation that takes into account the impact of multiple prediction errors on the downstream voltage regulation model. The framework is designed to train the multiple prediction models in a coordinated differential way in order to improve decision quality. The case study presented in the paper demonstrates the effectiveness of the proposed framework. The paper discusses a semi-end-to-end learning model for voltage regulation. The model is composed of three layers, with coordinated differential forward/backward propagation procedures. The first layer maps the input feature contexts to PV/load predictions by NN; the second layer maps the multiple predictions to var regulation decisions by the SOCP-driven voltage regulation model; the third layer maps the var decisions to the hybrid loss function. The decision quality from multiple predictions is measured for training the prediction models in reverse. The learning processes of the multiple prediction models are coupled through the SOCP-driven voltage regulation model. The paper proposes a regulation loss function that takes into account both the power loss for economy and the voltage violation penalty for safety. The decision model is formulated as a parameterized SOCP problem, which is then solved using a three-stage solution map. The first stage maps the input parameters to a skew-symmetric matrix, the second stage solves the homogeneous self-dual embedding, and the third stage maps the solution to the optimal primal-dual solution. The total derivative of the solution map is then derived using the chain rule. The paper presents a safety-aware semi-end-to-end coordinated decision model for voltage regulation. The model integrates multiple prediction models for PV and load power into a voltage regulation model in order to improve decision quality. The paper proposes a hybrid training strategy that considers both prediction accuracy and decision quality, and a stochastic gradient descent learning algorithm for training the model. The algorithm is tested on a 33-bus radial distribution network, and the results show that the proposed model outperforms existing models in terms of power loss and safety awareness. The paper presents a semi-end-to-end decision model for voltage regulation in distribution networks with high penetration of renewable energy sources. The model integrates prediction models for PV generation and load demand with a downstream optimization model for voltage regulation. The model is trained using the hybrid stochastic gradient descent (HSGD) algorithm, which combines prediction-error gradients with regulation-loss gradients (a conceptual sketch of this hybrid loss follows below). The paper evaluates the performance of the model in terms of power loss reduction and safety awareness. Results show that the model outperforms conventional methods in terms of power loss reduction. 
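The following is a conceptual sketch, not the paper's implementation, of the hybrid training idea summarized above: a weighted sum of prediction error and regulation loss is back-propagated through the decision layer into the upstream predictor. The paper differentiates through the actual SOCP solution map; here `decision_layer`, `regulation_loss`, the network sizes, and the weight `alpha` are hypothetical, differentiable stand-ins chosen only to make the gradient flow concrete.

    # Sketch: hybrid loss (prediction error + regulation loss) driving the predictor.
    import torch
    import torch.nn as nn

    class Predictor(nn.Module):
        def __init__(self, n_features, n_outputs):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(n_features, 64), nn.ReLU(),
                                     nn.Linear(64, n_outputs))
        def forward(self, x):
            return self.net(x)

    def decision_layer(pv_load_pred):
        # Stand-in for the SOCP-driven var decision-making layer (differentiable surrogate).
        return torch.tanh(pv_load_pred)

    def regulation_loss(decisions, pv_load_true, w_violation=10.0):
        # Stand-in for power loss (economy) plus voltage-violation penalty (safety).
        power_loss = ((decisions - pv_load_true) ** 2).mean()
        violation = torch.relu(decisions.abs() - 1.0).mean()
        return power_loss + w_violation * violation

    model = Predictor(n_features=8, n_outputs=4)
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)
    alpha = 0.5                           # weight between prediction accuracy and decision quality

    features = torch.randn(32, 8)         # input feature contexts
    pv_load_true = torch.randn(32, 4)     # realized PV/load

    pred = model(features)
    loss_pred = nn.functional.mse_loss(pred, pv_load_true)
    loss_reg = regulation_loss(decision_layer(pred), pv_load_true)
    hybrid = alpha * loss_pred + (1 - alpha) * loss_reg   # hybrid loss
    opt.zero_grad()
    hybrid.backward()                     # hybrid gradients flow back to the predictor
    opt.step()

The essential design choice this illustrates is that the predictor's weights receive gradients from the decision-quality term as well as from the ordinary prediction-error term, which is what couples the multiple prediction models through the regulation model.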
The paper compares the power loss and regret of two models, the SE2E model and the MSE model. The SE2E model is designed to improve the decision quality from multiple prediction upgrades, while the MSE model is designed to focus on the overall prediction accuracy. The study found that the SE2E model decreases the decision-making regret by 52.72% on average compared to MSE, and the average power loss under SE2E is 0.58 kW lower than that under MSE. However, the load prediction accuracy degrades in the SE2E model. The study also found that the proposed SE2E-based model exhibits more secure voltage operation than the MSE model, verifying its safety awareness. The paper proposes a safety-aware semi-end-to-end coordinated decision model for voltage regulation by connecting the NN-driven prediction and SOCP-driven regulation models for high decision quality. The regulation loss is designed to evaluate the economy and safety awareness of decision-making. Then the HSGD learning algorithm is proposed to train the multiple PV/load prediction models in a coordinated differential way. Case studies verify that, compared to the conventional predict-and-optimize model, the proposed decision model achieves regulation economy with lower power loss in the economic scenario and safety awareness with a lower voltage violation rate in the safety-aware scenario. The paper discusses the use of second-order cone programming to develop a scalable and compatible voltage regulation model. The model is designed to be used with inverter-based distributed generators, and includes constraints and objectives related to the regulation of voltage and power output. The paper includes a discussion of the decision-making and decision-evaluating layers of the model, as well as a comparison of the proposed model to existing methods. The paper discusses federated reinforcement learning for decentralized voltage control in distribution networks. It reviews previous work on optimal control of DERs in ADNs, voltage regulation in distribution networks, and state estimation in distribution networks. It then describes a new approach to voltage control using federated reinforcement learning, and provides results of simulations comparing the new approach to previous methods. The paper discusses the research interests of a professor at Tsinghua University in China, which include energy management systems, active distribution system operation and control, and machine learning. The author was awarded the National Science Fund of China Distinguished Young Scholar Award in 2017.
2305.15504.pdf: The paper develops an adaptive state vector observer for a nonlinear time-varying system with a measurable output. The task is solved under the assumption that the input matrix (vector) and the nonlinear component in the state equation of the system contain unknown constant parameters. The developed adaptive observer is based on the generalized parameter estimation based observer (GPEBO) approach, which was proposed in [1]. During observer synthesis, a preliminary parametrization of the initial nonlinear system is performed. After that, the obtained system is transformed into a linear regression model. The next step of the algorithm is estimation of the unknown linear regression parameters; the least-squares method with a forgetting factor [2, 3] is used for this (see the sketch below). In [4], a linear time-varying system with unknown parameters in the state matrix and the input matrix (vector) was considered. The current work extends the result obtained in [4] to the case when the state equation of the system contains an unknown nonlinear component. The paper discusses the design of a linear controller for a single-input, single-output (SISO) system. The controller is designed to minimize the effect of disturbances on the system. The paper presents a method for designing the controller using an optimization technique called the linear quadratic regulator (LQR). The LQR approach is used to find the optimal control input that minimizes a cost function, given by a sum of the squares of the system's state variables and the control input. The paper discusses the use of generalized parameter estimation-based observers for power systems and chemical-biological reactors. The observers are designed to provide improved performance over traditional observers. The paper provides theoretical analysis of the observers and demonstrates their effectiveness through simulations. This literature review looks at several papers on observer design for linear time-varying systems with unknown inputs. The first paper looks at the design of linear unknown input observers for sensor fault estimation in nonlinear systems. The second paper looks at the design of an adaptive state estimation observer for state-affine systems with unknown time-varying parameters. The third paper looks at the design of robust time-varying parameter estimation based on the I-DREM procedure. The fourth paper looks at the design of an adaptive observer for actuator and sensor fault diagnosis in linear time-varying systems. The fifth paper looks at the design of a fault diagnosis observer for linear time-varying systems based on high-gain adaptive compensation sliding mode observers. The sixth paper looks at the design of an observer for an LTV system with partially unknown state matrix and delayed measurements. The seventh paper looks at the design of an observer for a time-varying system with unknown inputs. The eighth paper looks at the design of a parameter estimator via dynamic regressor extension and mixing.
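A minimal sketch of recursive least squares with a forgetting factor, the standard estimator referred to above for the unknown linear-regression parameters; the regressor phi_k and scalar measurement y_k are assumed to be available after the parametrization step, and the synthetic data below are purely illustrative.

    # Sketch: recursive least squares with forgetting factor.
    import numpy as np

    def rls_forgetting(phis, ys, lam=0.98, theta0=None, p0=1e3):
        """Recursive least squares with forgetting factor lam in (0, 1]."""
        n = phis.shape[1]
        theta = np.zeros(n) if theta0 is None else theta0.copy()
        P = p0 * np.eye(n)
        for phi, y in zip(phis, ys):
            denom = lam + phi @ P @ phi
            K = P @ phi / denom                    # gain
            theta = theta + K * (y - phi @ theta)  # parameter update
            P = (P - np.outer(K, phi @ P)) / lam   # covariance update with forgetting
        return theta

    # Example: recover a constant parameter vector from noisy regressions.
    rng = np.random.default_rng(1)
    theta_true = np.array([1.5, -0.7, 0.3])
    phis = rng.normal(size=(500, 3))
    ys = phis @ theta_true + 0.01 * rng.normal(size=500)
    print(rls_forgetting(phis, ys))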
2305.15549.pdf: The paper discusses a method for improving the accuracy of soil moisture estimation by simultaneously estimating a subset of soil hydraulic parameters. The proposed method uses microwave remote sensing to obtain soil moisture observations, and the extended Kalman filter to assimilate these observations into a field model. The effectiveness of the proposed methodology is demonstrated through numerical simulations and a real field experiment. Results show that the proposed method can improve soil moisture estimation accuracy by 24-43%. This paper discusses a method for estimating soil moisture and hydraulic parameters simultaneously, using an extended Kalman filter. The authors first use sensitivity analysis and orthogonal projection to identify a subset of hydraulic parameters that can be uniquely estimated. They then use the extended Kalman filter to estimate the soil moisture and the selected hydraulic parameters. Finally, they use Kriging interpolation to interpolate the estimates of the estimable hydraulic parameters across the field. This study is novel in its use of sensitivity analysis and orthogonal projection methods to evaluate the estimability of soil hydraulic parameters before the simultaneous estimation of soil moisture and hydraulic parameters. The paper discusses a new method for estimating soil moisture content in an agricultural field. The method is based on the Richards equation, which is a difficult equation to solve. The paper describes several mechanisms that are used to guarantee a reliable numerical solution of the equation. The paper also describes how the state-space representation of the field model can be used to estimate the soil moisture content. The paper discusses a proposed approach for simultaneous state and parameter estimation using the sensitivity analysis method and the orthogonal projection method. The authors argue that this approach is suitable for a system in which the measured locations of the field change over time, as is the case with a center pivot irrigation system. The proposed methodology is illustrated with an example in which the field is partitioned into 40 sectors. It is shown that this approach can be used to group all the scaled output sensitivity matrices that share the same measured states into a single matrix. The paper discusses the use of the extended Kalman filter for simultaneous state and parameter estimation in a field model. The authors note that some parameters may be non-estimable, and thus use the orthogonal projection technique to identify the most estimable parameters. This approach is applied to different sectors of the field to determine the corresponding most estimable parameters for measurements from different sectors. The authors note that this approach can be used to simplify the online estimation of states and selected parameters. The paper presents a method for estimating soil moisture and hydraulic parameters using a Kriging interpolation approach. The method is tested using simulated microwave sensor measurements. The results show that the proposed method is effective in estimating soil moisture and hydraulic parameters. The paper presents a method for estimating soil moisture and soil hydraulic parameters simultaneously. The method is evaluated through three different cases, with the most accurate results coming from the third case, in which both soil moisture and soil hydraulic parameters are estimated. 
The method is then applied to a real case study, with the results indicating that the proposed approach is accurate and efficient. The paper discusses the use of a numerical model to study the irrigation of a field with a center pivot system. The field is discretized into a number of radial and azimuthal sectors, and the model is used to estimate the state of the field (moisture content, etc.) at each node. Data from microwave radiometers is used to infer the movement of the center pivot and to map measurements to the nodes of the field model. Ancillary weather data is used to determine the crop coefficient for barley. The model is validated using measurements from July. The paper presents a method for estimating soil moisture and hydraulic parameters simultaneously using a state and parameter estimator. The estimator is initialized with a covariance matrix that takes into account all the possible estimable parameters. The EKF is initialized with x_a(0) and a diagonal covariance matrix P_a(0|0), which are set based on the limited information available about x_a(0). The covariance matrices of process uncertainty Q and measurement noise R are considered as tuning parameters in this study. The performance of the state and parameter estimation approach is assessed with two types of cross-validation. In the first type, the measurements acquired at each sampling time are randomly split into a training set and a validation set. The estimates provided by the state and parameter estimator are compared with the measurements in the validation set. In the second type of cross-validation, all the measurements obtained on a specific day during the simulation period (specifically July 21st, 2021) are used for validation. The normalized root mean square error (NRMSE) is used to quantify the performance of the proposed approach. To further highlight the advantages of estimating soil moisture and hydraulic parameters simultaneously, the results of two additional case studies are presented and analyzed together with the results of the proposed approach. The paper discusses the results of a study that sought to improve the accuracy of soil moisture estimates by estimating soil hydraulic parameters. The study used a type of cross-validation called the leave-one-out method, which is a statistical technique that allows for the assessment of the accuracy of predictions made by a model. The study found that the proposed approach for estimating soil hydraulic parameters can significantly improve the accuracy of soil moisture estimates. The paper proposes a method for simultaneously estimating soil moisture and soil hydraulic parameters in large-scale agro-hydrological systems with soil heterogeneity by utilizing soil moisture observations acquired through microwave radiometers installed on center pivot irrigation systems. The proposed approach involves modeling the field under study with the cylindrical-coordinate version of the Richards equation, employing the sensitivity analysis and orthogonal projection methods to address issues of parameter estimability, and assimilating the remotely sensed moisture observations into the field model using the extended Kalman filtering technique. The outcomes of the sensitivity analysis and orthogonal projection methods show that the hydraulic parameters of the measured nodes in the field model are the most estimable, and these parameters, along with all states of the field model, can be reliably estimated. 
Simulated and real case studies were conducted, revealing that the proposed approach enhances the accuracy of soil moisture estimation while providing reliable estimates of hydraulic parameters. Therefore, it can be concluded that the proposed method can provide accurate soil moisture information for the design of closed-loop irrigation. Despite the promising results, other modifications or directions are worth exploring. The paper reviews various methods for estimating hydraulic parameters using remote sensing data. The particle filter is found to be the most accurate method for estimating soil moisture content. The extended Kalman filter is also found to be accurate for estimating soil moisture content, but is less accurate for estimating hydraulic parameters. The dual state-parameter estimation approach is found to be the most accurate for estimating both soil moisture content and hydraulic parameters simultaneously.
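A minimal sketch of the augmented-state extended Kalman filter step underlying the simultaneous soil-moisture and hydraulic-parameter estimation described above: the estimable parameters are appended to the state vector with a random-walk model and standard EKF predict/update steps are applied. The toy dynamics, Jacobians, and noise covariances below are illustrative placeholders, not the paper's field model.

    # Sketch: one augmented-state EKF predict/update step.
    import numpy as np

    def ekf_step(x_aug, P, z, f, h, F_jac, H_jac, Q, R):
        """One EKF step on the augmented state [soil-moisture states; parameters]."""
        # Predict (parameters propagate unchanged, i.e. a random walk)
        x_pred = f(x_aug)
        F = F_jac(x_aug)
        P_pred = F @ P @ F.T + Q
        # Update with the remotely sensed soil-moisture observations z
        H = H_jac(x_pred)
        S = H @ P_pred @ H.T + R
        K = P_pred @ H.T @ np.linalg.inv(S)
        x_new = x_pred + K @ (z - h(x_pred))
        P_new = (np.eye(len(x_aug)) - K @ H) @ P_pred
        return x_new, P_new

    # Toy usage: two moisture states, one hydraulic parameter, direct state measurements.
    n, m = 3, 2
    f = lambda x: np.array([0.9 * x[0] + 0.1 * x[2], 0.9 * x[1], x[2]])
    h = lambda x: x[:2]
    F_jac = lambda x: np.array([[0.9, 0.0, 0.1], [0.0, 0.9, 0.0], [0.0, 0.0, 1.0]])
    H_jac = lambda x: np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    Q, R = 1e-4 * np.eye(n), 1e-3 * np.eye(m)
    x, P = np.array([0.3, 0.25, 0.05]), 0.1 * np.eye(n)
    x, P = ekf_step(x, P, np.array([0.32, 0.27]), f, h, F_jac, H_jac, Q, R)
    print(x)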
2305.15557.pdf: This paper proposes a new non-parametric learning paradigm for the identification of drift and diffusion coefficients of non-linear stochastic differential equations, which relies upon discrete-time observations of the state. The key idea is to fit an RKHS-based approximation of the corresponding Fokker-Planck equation to such observations, yielding theoretical estimates of learning rates which, unlike previous works, become increasingly tighter when the regularity of the unknown drift and diffusion coefficients becomes higher. The method is kernel-based, and offline pre-processing may be leveraged to enable efficient numerical implementation. The paper proposes a method for learning the laws of a stochastic differential equation from independent discrete-time observations of the state. The method is composed of two steps: (1) learning the densities of the solutions to the stochastic differential equation through a kernel-based model, and (2) learning finite-dimensional models for the drift and diffusion coefficients by fitting approximated solutions to the Fokker-Planck equation. The main benefit of the method is that it can be used to accurately approximate the laws of solutions to the stochastic differential equation without resorting to conservative families of parametric densities. In addition, learning rates for the identification of the drift and diffusion coefficients of the stochastic differential equation are provided. This paper provides a meta-theorem on the accuracy of a two-step identification method for learning stochastic differential equations with fast rates. The method is based on observing the process solution to a stochastic differential equation with unknown drift and diffusion coefficients, and then estimating the coefficients up to an error of order ε log(1/ε). This result has important implications for the observation and regulation of stochastic differential equations, and may represent a good starting point for developing paradigms for the identification of controlled stochastic differential equations, which are crucial for the control of complex autonomous systems. The paper discusses the relationship between solutions to stochastic differential equations (SDEs) and the Fokker-Planck equation (FPE). It is shown that solutions to the FPE can be used to approximate solutions to SDEs. Additionally, it is shown that the solutions to the FPE are absolutely continuous, and that they satisfy the strong Fokker-Planck equation (SFPE). These results are then used to study the accuracy of a learning approach for approximating the observation/regulation metric in (3.4). The paper discusses a method for learning the coefficients of a stochastic differential equation (SDE) from data. The authors first approximate the unknown solution to the SDE using a reproducing kernel Hilbert space (RKHS)-based model. They then use this model to approximate the coefficients of the SDE that best match the data. Finally, they make the learning problem tractable by approximating the coefficients using a finite-dimensional RKHS-based model. The authors show that their method provides theoretical error bounds in the context of observation and regulation of stochastic differential equations. The paper discusses the approximation of a probability measure p via a model density p̂. 
The authors prove that there exists a constant C > 0, depending only on n, m, and T, such that the following learning rate for the random model p̂ holds with probability at least 1 − δ, for every M, N ∈ ℕ such that M ≥ 2T:

    L(p̂) ≤ C (||p||²_{H^{m+1,2}} + ||p||²_{H^{1,2(m+1)}}) (M^{-2m} + R^{-4m} + R^{n} log(4M/δ)² N^{-1}).

The paper bounds the error of estimating the coefficients of a stochastic differential equation with fast rates, using a random mapping to bound the error, and it likewise bounds the error of estimating the coefficients via the model density. The paper discusses the problem of learning stochastic differential equations with fast rates. The main result is an appropriate learning error estimate which can be used to obtain learning error estimates for the problem of learning through finite-dimensional models. The estimate depends on the parameters defining the learning problem, and the authors show that the result can be applied to a wide range of applications. The paper discusses a method for learning stochastic differential equations with fast rates. The main result is that, with probability at least 1 − δ, the error between the unknown densities and the densities stemming from the learned coefficients is bounded by C(a,b) ε log(1/ε), where the constant C(a,b) > 0 depends on a and b uniquely. The paper introduces a method for learning the coefficients of a stochastic differential equation with fast rates. The proposed method is based on solving a finite-dimensional optimization problem, which is a natural extension of the original problem. The main result of the paper is that, with probability at least 1 − δ, the proposed method converges to the true coefficients at a rate of order ε log(1/ε). The paper studies a learning paradigm for the identification of drift and diffusion coefficients of non-linear stochastic differential equations, based on discrete-time observation of the state. Under assumptions of smoothness for the unknown coefficients, the authors provide theoretical estimates of learning rates which become increasingly tighter when the number of observations and the regularity of the unknown coefficients grow. The paper discusses the well-posedness of stochastic differential equations (SDEs) and proves a theorem on the uniqueness of solutions to SDEs. The theorem states that for every (a, b) ∈ H⁺_m, there exists a unique measurable mapping X: ℝⁿ × Ω → S which solves the SDE with coefficients (a, b) ∈ H⁺_m. In addition, the theorem states that there is a continuous mapping X̄: ℝⁿ × Ω → S such that the process (X̄(x,·))_{x∈ℝⁿ} is a modification of (X(x,·))_{x∈ℝⁿ}. The paper establishes that there is a one-to-one correspondence between solutions to the Fokker-Planck equation and solutions to a stochastic differential equation. It also provides a method for finding solutions to the Fokker-Planck equation, and gives conditions under which a solution to the Fokker-Planck equation is unique. The paper discusses an operator O that can be used to compute the solution to a Fokker-Planck equation with coefficients (a, b) ∈ H⁺_m. The operator O is shown to be measurable and in L¹, and the solution p is shown to be in H^{m+1}. 
The paper discusses the existence of a linear and bounded extension operator E: H¹([0,T]×ℝⁿ, ℝ) → H¹(ℝⁿ⁺¹, ℝ), i.e., there exists a constant C > 0 such that ||Eu|| ≤ C||u|| and Eu|_{[0,T]×ℝⁿ} = u for every u ∈ H¹([0,T]×ℝⁿ, ℝ). The proof is based on the classical Lions scheme and uses Gronwall's inequality. The operator E can be used to solve the FPE with source term f and coefficients (a, b) ∈ H⁺_m. If f = 0 and p is a non-negative density in L²(ℝⁿ, ℝ), then p(t, x) ≥ 0 for every t ∈ [0,T] and almost every x ∈ ℝⁿ. The paper provides a proof of a theorem related to the learning of stochastic differential equations with fast rates. The proof is divided into several steps, with the first step providing a second parabolic estimate. This estimate is used to bound the last two terms in an equation related to the learning process. The convergence of a sequence is also shown, which allows for the completion of the proof of the theorem. The paper provides a detailed analysis of the regularity of solutions to a class of strong Fokker-Planck equations (SFPEs). In particular, the authors consider this class of SFPEs and establish various results regarding the regularity of their solutions, showing that solutions are typically twice differentiable in time and four times differentiable in space. As a result, the authors conclude that these equations can be effectively learned using standard methods from the literature. The paper discusses the existence and uniqueness of solutions to an integro-differential variational problem, as well as the regularity of these solutions. The problem is motivated by the need to estimate the drift of a diffusion process. The authors use a reproducing kernel Hilbert space approach to show that the problem has a unique solution, which is twice differentiable in time and four times differentiable in space. This solution can be used to estimate the drift of the diffusion process. This paper discusses the estimation of stochastic differential equations (SDEs) from discrete observations. The authors consider both parametric and nonparametric estimation, and discuss various methods for each case. They also discuss the use of online gradient descent for estimation of SDEs, and provide some results on the convergence of this method.
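For reference, the SDE-to-Fokker-Planck correspondence that the two-step method above fits to data takes the following standard form (stated here in its usual textbook form; the paper's precise functional-analytic setting is as summarized above). For the SDE

    dX_t = b(X_t) dt + σ(X_t) dW_t,   with a(x) = σ(x) σ(x)ᵀ,

the density p(t, ·) of X_t satisfies the Fokker-Planck (Kolmogorov forward) equation

    ∂_t p(t, x) = − Σ_{i=1}^{n} ∂_{x_i} ( b_i(x) p(t, x) ) + (1/2) Σ_{i,j=1}^{n} ∂²_{x_i x_j} ( a_{ij}(x) p(t, x) ).

Fitting an RKHS model p̂ to the observed densities and then choosing finite-dimensional models of (a, b) so that p̂ approximately satisfies this equation is the two-step scheme described above.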
2305.15573.pdf: The paper provides a semi-global exponential stability (SGES) result for the combined attitude and position rigid-body motion tracking problem. Dual quaternions are used to jointly represent the rotational and translational tracking error dynamics of the rigid body. A novel nonlinear feedback tracking controller is proposed, and a Lyapunov-based analysis is provided to prove the SGES of the closed-loop dynamics. The analysis does not place any restrictions on the reference trajectory or the feedback gains. This stronger SGES result aids in further analyzing the robustness of the rigid-body system by establishing input-to-state stability (ISS) in the presence of time-varying, additive, and bounded external disturbances. The paper discusses the use of quaternions for rigid-body motion control, and presents a new feedback control law that guarantees semi-global exponential stability of both attitude and position states. The control law is robust in the presence of external disturbances and motion constraints, and does not require any prior knowledge of the bounds on the reference trajectory or body inertia. The paper proposes a nonlinear feedback tracking control law for rigid-body dynamics that is based on dual quaternions. The control law is designed to guarantee asymptotic and semi-global exponential stability for the closed-loop system. A Lyapunov function is used to show asymptotic stability, and the semi-global exponential stability result is then shown by using properties of the control law and the system dynamics. The paper provides a feedback control law that renders the tracking error dynamics semi-globally exponentially stable. To do so, the paper first provides a definition of semi-global exponential stability, before stating the main result of the paper. The main result states that, under the provided feedback control law, the tracking error dynamics are semi-globally exponentially stable. The proof of this result is provided, along with expressions for the parameters involved. The paper provides a proof that a closed-loop system consisting of dual quaternion-based rigid-body dynamics with additive bounded disturbances is locally input-to-state stable. This result is obtained by using Lyapunov analysis and the fact that the system is continuously differentiable. This paper discusses the use of Control Barrier Functions (CBFs) to generate safe control inputs for spacecraft rendezvous and docking. The CBF is defined as a composite surface that includes a curve based on the cissoid, a partial spherical surface, and a third section that links the two. The control inputs are synthesized by solving a Quadratic Program (QP) that minimizes the force applied by the chaser spacecraft while still ensuring that the motion and safety constraints are met. The paper discusses a new type of controller that is used for safe spacecraft rendezvous. The controller is an exponentially stabilizing controller that is shown to be more effective than traditional PID controllers. The controller is tested on three realistic scenarios and is shown to be successful in each case. The paper discusses a new feedback control law for spacecraft that is based on dual quaternions. The law is designed to enable the spacecraft to autonomously dock with the International Space Station. 
The authors demonstrate the efficacy of the law in realistic scenarios, such as the MarCO mission, the Apollo transposition and docking problem, and the docking of SpaceX Dragon 2 with the International Space Station. The paper reviews several existing papers on the topic of attitude control for spacecraft. The first paper reviews the problem of attitude control without angular velocity feedback. The second paper reviews the problem of rigid body attitude tracking without angular velocity feedback. The third paper reviews the problem of simultaneous position and attitude control without linear and angular velocity feedback. The fourth paper reviews the problem of rigid body motion tracking without linear and angular velocity feedback. The fifth paper reviews the problem of nonlinear pose control and estimation for space proximity operations. The sixth paper reviews the problem of control barrier functions under high relative degree and input constraints for satellite trajectories. The seventh paper reviews the problem of ellipse cissoid-based potential function guidance for autonomous rendezvous and docking with non-cooperative target. The eighth paper reviews the problem of guaranteed safe spacecraft docking with control barrier functions. The ninth paper reviews the problem of robust control barrier functions under high relative degree and input constraints for satellite trajectories. The tenth paper reviews the problem of CubeSat proximity operations demonstration (CPOD) mission. The eleventh paper reviews the problem of mass properties for the CSM/LM spacecraft. The twelfth paper reviews the problem of constraints and performance for the CSM/LM spacecraft.
2305.15589.pdf: This paper focuses on the technology of connected and automated driving vehicles and how they are implemented, rather than on the pure science behind them. The paper describes the modeling of the vehicle, the validation of the model, the hardware-in-the-loop system used in development work, and the hardware architecture and implementation details. It also presents some experimental results. The paper presents a simple vehicle dynamics model that can be used to simulate the behavior of automated vehicles. The model is validated using experimental data and shown to provide good results. The model is then used to conduct simulations in a hardware-in-the-loop environment to test the integration of various automated vehicle subsystems. The paper discusses the results of the ISO 3888-1 double lane change test, which is used to assess a vehicle's ability to change lanes and then return to its original lane. The test is conducted by measuring the steering wheel angle and the vehicle velocity, and comparing the results to a simulation done in CarSim. The results show that the CarSim model is a good representation of the experimental data. The paper also discusses the use of a drive-by-wire system to control the throttle, brake, and steering actuators. The system uses a dSPACE MicroAutoBox (MABX) electronic control unit to automate the throttle, brake, and steering. The MABX controller is programmed with a PI-type position controller for steering control. A program in the SMART motor interprets the signal from the MABX controller and produces a PWM signal to drive the motor. For demonstration of autonomous driving, a simple path-following control algorithm is implemented. The desired path is created by choosing the desired GPS waypoint positions (WPs) using a map application. After sending the WPs to the controller, the path-following algorithm creates a bearing vector between the vehicle position and the target WP. Then, the angle between the bearing vector and the heading vector is calculated, and a PID-type controller is used to keep this angle at zero while controlling the steering (see the sketch below). A Delphi ESR radar, an Ibeo LUX four-channel lidar, and an Xsens GPS sensor with a built-in IMU and INS algorithm are used for environmental sensing and localization. The radar sensor is used to detect the range and speed of the followed vehicle in the cooperative/adaptive cruise control (CACC and ACC) systems. The lidar sensor is mostly used for detecting closer objects in autonomous driving. The paper presents a light commercial vehicle that has been modified for automated driving. The software, hardware, control, and decision-making architecture of the vehicle are described in detail. The vehicle has been equipped with actuators, sensors, controllers, and a modem for by-wire automation. The localization and environmental sensors used are explained in detail. The paper ends with exemplary test results of CACC testing in the longitudinal direction and automated path-following steering control in the lateral direction. The paper discusses the design of a connected and autonomous vehicle platform. The platform is designed for use in a variety of applications, including highly automated driving. The paper describes the platform's hardware and software, as well as its testing and validation. The platform is shown to be effective in a variety of scenarios, including path following and collision avoidance. 
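The following is an illustrative sketch, not the project's code, of the waypoint path-following logic described above: a bearing vector from the vehicle position to the target GPS waypoint is formed, the signed angle between the bearing and the heading is computed, and a PID controller drives that angle to zero through the steering command. The gains, time step, and the assumption of a local planar coordinate frame are illustrative choices, not values from the paper.

    # Sketch: bearing-vector path following with a PID steering controller.
    import math

    class PID:
        def __init__(self, kp, ki, kd, dt):
            self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
            self.i, self.prev = 0.0, 0.0
        def step(self, err):
            self.i += err * self.dt
            d = (err - self.prev) / self.dt
            self.prev = err
            return self.kp * err + self.ki * self.i + self.kd * d

    def steering_command(vehicle_xy, waypoint_xy, heading_rad, pid):
        # Bearing from vehicle to waypoint, then signed angle error w.r.t. heading.
        bearing = math.atan2(waypoint_xy[1] - vehicle_xy[1],
                             waypoint_xy[0] - vehicle_xy[0])
        err = math.atan2(math.sin(bearing - heading_rad),
                         math.cos(bearing - heading_rad))  # wrap to [-pi, pi]
        return pid.step(err)   # steering angle command (to the low-level position loop)

    pid = PID(kp=1.2, ki=0.05, kd=0.1, dt=0.05)
    print(steering_command((0.0, 0.0), (10.0, 5.0), 0.0, pid))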
The paper evaluates the fuel economy benefit of a Pass-at-Green (PaG) application on urban routes with STOP signs. The PaG application is a V2I application that allows vehicles to pass through intersections without stopping. The study found that the PaG application can reduce fuel consumption by up to 10%. The paper also discusses a cooperative ecological cruising strategy for connected and automated vehicles. The strategy is designed to optimize sustainable performance for vehicles on varying road conditions. The study found that the strategy can improve fuel economy by up to 20%. Finally, the paper presents a comprehensive eco-driving strategy for connected and autonomous vehicles. The strategy is designed to improve fuel efficiency and mobility of traffic networks. The study found that the strategy can improve fuel economy by up to 30%.
2305.15602.pdf: The paper proposes a new approach to reinforcement learning (RL) called control invariant set (CIS) enhanced RL. The approach uses the explicit form of the CIS to improve stability guarantees and sampling efficiency. It consists of two learning stages: offline and online. In the offline stage, the CIS is incorporated into the reward design, initial state sampling, and state reset procedures; this incorporation of the CIS improves sampling efficiency during the offline training process (a simplified sketch of this offline use of the CIS follows below). In the online stage, a Safety Supervisor examines the safety of each action and makes necessary corrections, and the RL agent is retrained whenever the predicted next-step state is outside of the CIS, which serves as the stability criterion. The stability analysis is conducted for both cases, with and without uncertainty. To evaluate the proposed approach, the authors apply it to a simulated chemical reactor. The results show a significant improvement in sampling efficiency during offline training and a closed-loop stability guarantee in the online implementation, with and without uncertainty. The paper discusses how to use RL in conjunction with a control invariant set (CIS) in order to improve safety and efficiency. The authors propose a two-stage approach that involves offline training with a process model, followed by online learning when the safety constraint is violated. A new control implementation strategy, called the Safety Supervisor, is introduced to ensure closed-loop stability. The algorithm is designed to accommodate different control objectives, and is applied to a chemical reactor to demonstrate its effectiveness. The paper presents a CIS-enhanced reinforcement learning (RL) approach for closed-loop stability and improved sampling efficiency. In the proposed approach, the RL agent is trained offline with CIS information to achieve a near-optimal policy. The incorporation of the CIS in the offline training can significantly improve its sampling efficiency, since the amount of data needed for training is reduced. While the CIS is used in offline training, the offline-trained policy does not guarantee closed-loop stability. In order to ensure closed-loop stability, the RL agent should further be trained online with the CIS information. The paper proposes a new approach for training a reinforcement learning agent to ensure closed-loop stability. The key idea is to use the information from a known control invariant set (CIS) to design the reward function and to sample initial states for training. The paper demonstrates that this approach can improve learning efficiency and lead to more stable policies. The paper discusses a method for designing a reinforcement learning algorithm that is both stable and achieves optimal performance. The method involves offline training of the RL agent, followed by online implementation of the learned policy using a stability-guaranteed strategy. The algorithm is further updated online when new situations are encountered. A safety supervisor is placed between the RL agent and the environment to ensure closed-loop stability. A backup table is used to save safe actions for those states where the online training fails to find a safe action within the maximum number of iterations. This approach provides a safety guarantee for the RL agent while still allowing it to achieve optimal performance. The paper presents a CIS-enhanced RL approach that ensures closed-loop stability during real-time implementation, enabling its application in safety-critical processes. 
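The following is a simplified sketch, under assumptions rather than the authors' exact design, of how an explicit polytopic CIS {x : A x ≤ b} can be used in the offline stage: reward shaping that penalizes leaving the CIS, initial-state sampling from inside it, and the corresponding reset check. The box-shaped set, the quadratic tracking term, and the penalty weight are all illustrative choices.

    # Sketch: offline use of a polytopic CIS for reward design and initial-state sampling.
    import numpy as np

    A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
    b = np.array([1.0, 1.0, 1.0, 1.0])          # toy box-shaped CIS {x : A x <= b}

    def in_cis(x):
        return np.all(A @ x <= b)

    def shaped_reward(x, x_setpoint, penalty=100.0):
        tracking = -np.sum((x - x_setpoint) ** 2)        # control objective term
        return tracking if in_cis(x) else tracking - penalty  # penalize leaving the CIS

    def sample_initial_state(rng, n_max=1000):
        # Rejection-sample an episode's initial state from inside the CIS.
        for _ in range(n_max):
            x = rng.uniform(-1.5, 1.5, size=2)
            if in_cis(x):
                return x
        raise RuntimeError("could not sample a state inside the CIS")

    rng = np.random.default_rng(0)
    x0 = sample_initial_state(rng)
    print(x0, shaped_reward(x0, np.zeros(2)))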
The approach is adapted to consider a robust CIS (RCIS) in the presence of model uncertainty, which excludes from the safe zone for RL exploration those states that become control-variant due to uncertainty. The framework for the stochastic case remains the same as for the deterministic case, with essential modifications required for offline learning and online implementation. The paper discusses a modification to the Safety Supervisor in RL that involves an optimization problem to ensure the safety of an action. The objective function P(x_{k+1} ∉ R_r) denotes the probability of x_{k+1} being outside of R_r. If the optimal value J* = 1, then there exists a disturbance w_k that can be applied to the system to make the actual state x_{k+1} violate the safe set R_r. The probability serves as a conceptual objective, meaning that it does not necessarily need to be explicitly defined or calculated. Instead, in this work, the explicit form of the RCIS is used to define the probability indirectly. Users can define their own probability objective function, either directly or indirectly, depending on the nature of the safety constraints and the specific requirements of the application. The optimization problem (15) is non-smooth due to the presence of the max operator, which makes it unsuitable for gradient-based optimization algorithms that require differentiable objective functions. To overcome this issue, a new variable z is introduced to represent the maximum element in A x_{k+1} − b, leading to the reformulated optimization problem:

    J* = max_{z, y_i, w_k}  z                                    (16)
    s.t.  z ≤ A_i x_{k+1} − b_i + M y_i,   i = 1, ..., c          (17)
          Σ_{i=1}^{c} y_i = c − 1                                 (18)
          y_i ∈ {0, 1},   i = 1, ..., c                           (19)
          x_{k+1} = f(x_k, u_k, w_k)                              (20)
          w_k ∈ W                                                 (21)

where there are c binary variables y_i corresponding to the c affine constraints in A x_{k+1} − b, with each y_i indicating whether the i-th constraint in (17) is active (y_i = 0) or inactive (y_i = 1). Constraint (18) forces exactly c − 1 constraints to be inactive, with the remaining one being active. The active constraint with y_i = 0 results in its corresponding inequality in (17) reducing to z ≤ A_i x_{k+1} − b_i, which provides an upper bound on the maximum element in A x_{k+1} − b. Since the optimal value J* represents the maximum element in A x_{k+1} − b, the constraint J* ≤ A_i x_{k+1} − b_i is no longer satisfied for the inactive constraints. To ensure that J* ≤ A_i x_{k+1} − b_i + M y_i holds for all constraints in (17), M is chosen as a sufficiently large positive constant (big-M). The paper presents a method for training an RL agent to control a continuously stirred tank reactor (CSTR) such that the system remains within a specified region of operation. The RL agent is trained using the proximal policy optimization (PPO) algorithm, and the training is conducted in an episodic fashion. Simulation results show that the proposed method is effective in maintaining the system within the specified region of operation. The paper proposes a new RL training setup that utilizes the knowledge of the control invariant set (CIS) to improve sampling efficiency. The setup was tested and compared with an RL training setup that did not utilize the CIS information. The results showed that the RL training setup that utilized the CIS information was able to achieve a lower failure rate than the setup that did not. The paper examines the benefits of the proposed online implementation of an RL agent for a closed-loop system. The agent is trained offline and then implemented online, interacting with the environment for 10,000 episodes. 
The proposed online implementation is stability-guaranteed and results in a better RL agent, as evidenced by a comparison of the failure rates of the offline-trained and online-trained agents. The paper discusses a method for training an RL agent to control a chemical reactor while considering safety and economic objectives. The agent is trained offline, MILP-based safety verification is used to keep the system within a safe operating region, and online retraining is conducted as needed to account for uncertainty. The results show that the agent is able to maintain stability and achieve good economic performance. This paper proposes a novel approach for combining a control invariant set (CIS) with reinforcement learning (RL) in order to achieve stability-guaranteed RL implementation. The authors argue that this approach offers a promising framework for achieving stability-guaranteed RL control while optimizing control objectives in complex systems. The approach is divided into two stages: an offline training stage and an online implementation stage. In the offline training stage, the CIS is integrated into the reward function design, initial state sampling, and state reset technique. In the online implementation stage, the algorithm is enhanced by incorporating mixed-integer linear programming (MILP) to provide stability guarantees in the presence of uncertainty, and stability proofs are provided for both the deterministic and uncertain cases. The proposed approach is applied to a two-dimensional nonlinear CSTR system, and the results demonstrate improved performance compared to RL without the CIS. The paper discusses various methods for safe reinforcement learning, including shielding, projection onto a safe set, and robust model predictive shielding. It compares these methods and discusses their advantages and disadvantages. It also describes a new method for computing control invariant sets of constrained nonlinear systems, which is more efficient and scalable than existing methods.
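The following is a simplified, deterministic stand-in for the online Safety Supervisor logic summarized above: before an RL action is applied, the one-step model prediction is checked against the polytopic (R)CIS {x : A x ≤ b}; if the prediction leaves the set, perturbed candidate actions are tried and a backup action is used as a last resort. The paper solves the MILP above for the uncertain case and maintains a backup table; the random candidate search, the single backup action, and the toy dynamics here are assumptions for illustration only.

    # Sketch: safety-supervisor check against a polytopic (R)CIS before applying an action.
    import numpy as np

    def safety_supervisor(x, u_rl, step_model, A, b, backup_action,
                          n_candidates=50, rng=None):
        rng = rng if rng is not None else np.random.default_rng()
        def safe(u):
            return np.all(A @ step_model(x, u) <= b)   # predicted next state stays in the set
        if safe(u_rl):
            return u_rl, False                          # RL action accepted
        for _ in range(n_candidates):                   # try perturbed candidate actions
            u_try = u_rl + rng.normal(scale=0.1, size=np.shape(u_rl))
            if safe(u_try):
                return u_try, True                      # corrected action; flag for retraining
        return backup_action, True                      # fall back to the backup action

    # Toy usage with linear dynamics x+ = 0.9 x + 0.2 u and a box CIS on a scalar state.
    A = np.array([[1.0], [-1.0]]); b = np.array([1.0, 1.0])
    model = lambda x, u: 0.9 * x + 0.2 * u
    u, corrected = safety_supervisor(np.array([0.95]), np.array([1.0]), model, A, b,
                                     backup_action=np.array([0.0]))
    print(u, corrected)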
2305.15635.pdf: The paper introduces the Vehicle-in-Virtual-Environment (VVE) method for safe, efficient and low-cost connected and autonomous driving function development, evaluation and demonstration. The VVE method is compared to the existing state of the art. Its basic implementation for a path-following task is used to explain the method, where the actual autonomous vehicle operates in a large empty area with its sensor feeds replaced by realistic sensor feeds corresponding to its location and pose in the virtual environment. The paper discusses the advantages of the VVE method over the existing state of the art, including the ability to easily change the development virtual environment and to inject rare and difficult events that can be tested very safely. The paper also presents experimental results for a pedestrian safety application use case. The paper discusses the need for a more realistic testing environment for autonomous vehicles, one that can more accurately replicate the conditions and challenges that these vehicles will face in the real world. The proposed solution is the Vehicle-in-Virtual-Environment (VVE) method, which uses a simulated environment that can be easily changed and customized. This allows for a more efficient and effective testing process while reducing the risks associated with testing autonomous vehicles on public roads. The paper discusses the advantages of the VVE method over current approaches for testing autonomous vehicles. The VVE method is expected to be a game changer for the autonomous vehicle industry, legislators, user groups and the public, as it will significantly decrease development cost and development time while improving product safety. The VVE method is illustrated with an application use case for pedestrian safety using Vehicle-to-Pedestrian (V2P) communication. The V2P vulnerable road user safety mobile phone app developed in the authors' earlier work in reference [21] is used here for the communication between the AV and the pedestrian. The VVE method is an excellent choice here because, along with software-based vulnerable road users, it is also possible to use real vulnerable road users that share the same virtual environment and move at displaced, safe locations while the AV in the empty parking lot perceives them to be on a collision-risk path. The paper discusses the use of V2P communication for pedestrian safety. The authors use the Carla simulator and Unreal Engine to demonstrate how V2P communication can be used to avoid collisions. The results of the experiment show that V2P communication can be used to reduce the risk of collisions. The paper discusses a new method for developing, evaluating and demonstrating connected and autonomous driving functions. The method, the Vehicle-in-Virtual-Environment (VVE), uses a realistic, three-dimensional environment to safely test different vehicle and pedestrian interactions. The VVE was used to demonstrate how the method can be used to safely test different scenarios with a real pedestrian and an autonomous vehicle (AV). The results showed that the VVE is a safe, efficient and low-cost method for testing AV functions.
1. The Limited Integrator Model Regulator is a model used to control vehicle steering. It is designed to be robust and has been used in various vehicle control applications.
2. The model has been used to design a yaw stability controller for a light commercial vehicle.
3. The model has also been used to design a lateral stability control system for a fully electric vehicle.
4. The model has been used to design a coordinated longitudinal and lateral motion control system for a four-wheel independent motor-drive electric vehicle.
5. The model has also been used to design a real-time controller for a parallel hybrid electric vehicle.
6. The model has been used to design a robust PID steering control system for a highly automated driving vehicle.
7. The model has also been used to design an automated robust path-following control system for a low-speed autonomous shuttle.
8. The model has been used to design a localization and perception system for a low-speed autonomous shuttle in a campus pilot deployment.
9. The model has also been used to design a socially acceptable collision avoidance system for a low-speed autonomous shuttle.
10. The model has been used to design a cooperative ecological cruising system for connected automated vehicles on varying road conditions.
11. The model has also been used to design a comprehensive eco-driving strategy for connected and autonomous vehicles.
12. The model has been used to design an autonomous road vehicle path planning and tracking control system.
13. The model has also been used to design a mobile safety application for pedestrians utilizing P2V communication over Bluetooth.
14. The model has been used to design a collision avoidance system for low-speed autonomous shuttles with pedestrians.
15. The model has also been used in a parameter-space based robust gain-scheduling design of automated vehicle lateral control.
The paper discusses the use of connected and autonomous vehicles (CAVs) for mobility studies. It describes the use of hardware-in-the-loop (HIL), traffic-in-the-loop (TIL), and software-in-the-loop (SIL) simulations for CAV development and evaluation. The paper also discusses the use of the Vehicle-in-Virtual-Environment (VVE) method for CAV development and evaluation. The paper presents a design method for controllers with parametric uncertainties, using a mixed sensitivity performance requirement. The method is demonstrated on an adaptive headlight system designed to minimize both the tracking error and the sensitivity to parametric uncertainties. The design method is based on a parameter-space approach, and the resulting controller is shown to be robust to parametric uncertainties.
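A minimal sketch of the VVE sensor-replacement loop described above, assuming for illustration the CARLA Python API (CARLA and Unreal Engine are the virtual-environment tools used by the authors); the host/port, blueprint choices, pedestrian placement, and the pose source get_real_pose are hypothetical placeholders, and the authors' actual VVE tooling and V2P interface are not reproduced here.

    import carla  # CARLA Python API for the virtual environment

    def run_vve_loop(get_real_pose, steps=1000):
        """Mirror the real AV's measured pose into the virtual world every cycle."""
        client = carla.Client("localhost", 2000)        # placeholder host/port
        world = client.get_world()
        bp = world.get_blueprint_library()
        spawn = world.get_map().get_spawn_points()[0]
        ego = world.spawn_actor(bp.filter("vehicle.*")[0], spawn)   # virtual twin of the real AV
        walker_tf = carla.Transform(carla.Location(x=spawn.location.x + 30.0,
                                                   y=spawn.location.y, z=1.0))
        walker = world.spawn_actor(bp.filter("walker.pedestrian.*")[0], walker_tf)
        try:
            for _ in range(steps):
                x, y, yaw_deg = get_real_pose()         # pose of the real AV in the empty test area
                ego.set_transform(carla.Transform(
                    carla.Location(x=x, y=y, z=spawn.location.z),
                    carla.Rotation(yaw=yaw_deg)))
                world.wait_for_tick()
                gap = ego.get_location().distance(walker.get_location())
                # 'gap' would feed the V2P collision-risk logic in place of a real sensor reading
        finally:
            ego.destroy()
            walker.destroy()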
2305.15755.pdf: The paper proposes a reinforcement learning technique to control a linear discrete-time system with a quadratic control cost while ensuring a constraint on the probability of potentially risky events. The proposed methodology can be applied to known and unknown system models with slight modifications to the reward (negative cost) structure. The problem formulation considers the average expected quadratic cost of the states and inputs over the infinite time horizon, and the risky events are modelled as the quadratic cost of the states at the next time step crossing a user-defined limit. Two approaches are taken to handle the probabilistic constraint under the assumption of known and unknown system model scenarios. For the known-system-model case, the probabilistic constraint is replaced with the Chernoff bound, while for the unknown-system-model case, the expected value of the indicator function of the occurrence of the risky event in the next time step is used. The optimal policy is derived from the observed data using an on-policy RL method based on an actor-critic (AC) architecture. Extensive numerical simulations are performed using a 2nd-order and a 4th-order system, and the proposed method is compared with the standard risk-neutral linear quadratic regulator (LQR). The paper discusses a method for solving a constrained optimization problem using a policy gradient-based actor-critic (AC) algorithm. The constraint is on the probability of the occurrence of a risky event, which is modelled as the event when the quadratic cost of the states crosses a user-defined limit. The AC algorithm is used to find a locally optimal policy. The paper includes extensive numerical simulations using a 2nd-order and a 4th-order system, and compares the performance of the proposed policy with the risk-neutral LQR. The results show that the proposed policy decreases the occurrences of risky events with a small increase in the quadratic cost when compared with the standard LQR. The paper discusses how to optimize a stochastic control problem when the system model is known or unknown. For the known-model case, the probability value in the constraint is replaced by the Chernoff bound. For the unknown-model case, the expectation operator is used. The moment generating function is derived for two special cases: when the noise is i.i.d. and non-zero-mean Gaussian, and when the noise is i.i.d. and generated from a Gaussian mixture model. The actor-critic method is used to find a deterministic optimal policy. The paper presents a method for training an actor-critic (AC) agent to approximate a policy function that is causal for an unknown system model. The AC agent is trained using a replay memory of past experiences, and the reward function is only used for training. The convergence of the AC algorithm has been theoretically established in certain cases, but a formal proof of convergence for the particular problem formulation is under study. Numerical simulations using two systems, one for each case study, are used to investigate the performance of the proposed methods. The results show that the proposed method's performance under the unknown-system-model assumption is similar to that under the known-model scenario. The paper proposes a novel approach for handling probabilistic constraints in infinite-time-horizon control problems for discrete-time linear Gauss-Markov systems, where risky events are modelled as quadratic costs of the states crossing a user-defined limit.
The paper also studies a new reward structure for the case when the system model is unknown. The proposed approach has the potential to be applied in various real-world control problems where probabilistic constraints need to be handled effectively. The paper discusses the use of deep reinforcement learning for continuous control. It compares the performance of different algorithms and finds that the trust region policy optimization algorithm performs the best.
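A minimal sketch of the risky-event model described above, assuming for illustration that the probabilistic constraint is folded into the reward through a fixed penalty weight; the matrices Q and R, the limit, and the weight lam are placeholders, and the paper's actual handling via the Chernoff bound (known model) or the indicator expectation inside the actor-critic updates (unknown model) is not reproduced here.

    import numpy as np

    def risky_event(x_next, Q, limit):
        """Indicator of the risky event: quadratic state cost at the next step exceeds the limit."""
        return float(x_next @ Q @ x_next > limit)

    def step_reward(x, u, x_next, Q, R, limit, lam):
        """Negative quadratic cost minus a penalty whose average approximates the constraint term."""
        return -(x @ Q @ x + u @ R @ u) - lam * risky_event(x_next, Q, limit)

    def empirical_risk(states, Q, limit):
        """Fraction of visited states at which the risky event occurred (estimates the probability)."""
        return float(np.mean([risky_event(x, Q, limit) for x in states]))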
2305.15840.pdf: The paper proposes a parametric estimation method for the impedance of a Li-ion battery, based on a fractional-order equivalent circuit model. The method is validated on simulations and applied to measurements of commercial Samsung 48X cells. The parametric estimation algorithm is described in detail. The paper discusses the use of a parametric estimation algorithm to identify the impedance of a Li-ion battery. The algorithm uses a multisine excitation signal and a zero-mean Gaussian white noise signal to identify the battery impedance. The obtained excitation signal is scaled such that it has the desired RMS value. The nonparametric estimate of the battery impedance at the nonzero excited frequencies is then obtained by a simple division of the discrete Fourier transform (DFT) spectra of the voltage and the current. The DFT of a windowed and sampled signal is the sum of its samples weighted by complex exponentials, X(k) = Σ_{n=0}^{N−1} x(n) e^{−j2πkn/N}. The parametric model takes the form of a fractional differential equation (FDE) with Riemann-Liouville fractional derivatives of order α. This paper discusses the estimation of the parameters of a fractional derivative model of a battery impedance. The model is based on the Randles circuit, and the parameters are estimated by minimising the equation error. The estimation is performed using the weighted total least squares method, which is shown to be consistent. Simulation results are provided to illustrate the performance of the estimation method. This paper presents a parametric estimation algorithm for the linear time-invariant impedance of a lithium-ion battery from current and voltage measurements. The algorithm is based on a fractional-order model with a Warburg element to model diffusion. The equation error of the corresponding FDE is linear in the parameters, such that its minimisation becomes a TLS estimation problem, which can be solved with the SVD of the regression matrix. Weighting the regression matrix with the variances of the equation error makes the estimation consistent. The algorithm is validated with simulations and measurements on a commercial lithium-ion battery. The relative error between the parametric and nonparametric estimates is largest at the low frequencies. This paper presents a circuit modeling approach for signal design in power sources. The authors describe how to use this approach to optimize the performance of power sources. They also discuss the challenges and limitations of this approach.
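A minimal sketch of the nonparametric impedance estimate described above, i.e. division of the voltage and current DFT spectra at the excited multisine bins; the excited harmonics, record length, and RMS level are placeholders.

    import numpy as np

    def multisine(excited_bins, n_samples, rms, rng=None):
        """Random-phase multisine excitation, scaled to the desired RMS value."""
        rng = rng or np.random.default_rng(0)
        t = np.arange(n_samples)
        u = np.zeros(n_samples)
        for k in excited_bins:
            u += np.cos(2 * np.pi * k * t / n_samples + rng.uniform(0, 2 * np.pi))
        return u * rms / np.sqrt(np.mean(u ** 2))

    def nonparametric_impedance(voltage, current, excited_bins):
        """Z(k) = V(k) / I(k) at the excited DFT bins (one steady-state period of data)."""
        V_spec = np.fft.fft(voltage)
        I_spec = np.fft.fft(current)
        bins = np.asarray(excited_bins)
        return V_spec[bins] / I_spec[bins]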
2305.16229.pdf: This paper considers a stochastic control framework, where the residual model uncertainty is learned using GP regression. The proposed formulation uses a posterior-GP to approximate the residual model uncertainty and a prior-GP to account for state-dependent noise. The two GPs are interdependent and are thus learned jointly using an iterative algorithm. Theoretical properties of the iterative algorithm are established. Advantages of the proposed state-dependent formulation include (i) faster convergence of the GP estimate to the unknown function, as the GP learns which data samples are more trustworthy, and (ii) an accurate estimate of the state-dependent noise, which can, e.g., be useful for a controller or decision-maker to determine the uncertainty of an action. The paper presents a method for learning an unknown function h(x) with state-dependent variance g(x) using Gaussian processes (GPs). The method uses two GPs: a posterior-GP that uses the data directly to learn h(x), and a prior-GP that uses the variance of the data to learn g(x). The two GPs are interdependent, and the method uses an iterative algorithm to refine both the posterior-GP and the prior-GP. Theoretical properties of the algorithm are presented, including boundedness and convergence under some mild assumptions. The paper proposes a new method for inferring an accurate uncertainty quantification or noise estimate from data. The method is based on training a prior-GP using the measurements y_i and the mean estimate of the posterior-GP, μ(x) in (7a). The posterior-GP mean is used to update the training data for the prior-GP,

Z_{j+1} = (Y − μ_{j+1}(X))² − σ_0² 1,                        (10b)

which is in turn used to update the prior-GP,

Σ_{j+1}(X) = K_{X,X} (K_{X,X} + σ_0² I)^{−1} Z_{j+1},        (10c)

see Line 8 in Algorithm 1. The iteration is stopped once a stopping criterion is met.

Algorithm 1: Iterative algorithm for GP regression with state-dependent noise
1   Initialize α_0 = 0, Z_0 = 0, Σ_0(X) = 0, j = 0;
2   do
3       %% update posterior-GP estimate
4       μ_{j+1}(X) = K_{X,X} (K_{X,X} + σ_0² I + Σ_j(X))^{−1} Y;
5       %% update training data for prior-GP using posterior-GP
6       Z_{j+1} = (Y − μ_{j+1}(X))² − σ_0² 1;
7       %% update prior-GP
8       Σ_{j+1}(X) = K_{X,X} (K_{X,X} + σ_0² I)^{−1} Z_{j+1};
9       %% check stopping criterion
10      α_{j+1} = (K_{X,X} + σ_0² I + Σ_{j+1}(X))^{−1} Y;
11      j ← j + 1;
12  while ‖α_{j+1} − α_j‖_2 ≥ ε;

Theorem 1 proves that the prior-GP iteration is contracting, i.e., the prior-GP converges to its optimum. This paper proposes a method for Gaussian process regression that takes into account the state at which the data points were collected when estimating the mean and variance of the underlying function. The method is validated using two numerical examples, one of which is a stochastic control example. The results show that the proposed method is more accurate than traditional Gaussian process regression methods, especially when the data points are spread out. The paper presents an iterative algorithm for learning a nonlinear function and a state-dependent noise distribution using GP regression. The algorithm is shown to converge under simplifying assumptions. Simulation results demonstrate the advantages of using the proposed GP formulation within a stochastic control framework. The paper discusses the use of Gaussian Processes (GPs) for Model Predictive Control (MPC) in autonomous vehicles. The authors compare GP-based MPC to other methods and show that GP-based MPC can outperform these methods in terms of safety and performance.
The paper also discusses the challenges associated with implementing GP-based MPC, including the need for careful design of the GP model and the need for computationally efficient methods for solving the optimization problem.
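A minimal NumPy sketch of Algorithm 1 as reconstructed above, assuming a squared-exponential kernel, one-dimensional inputs, and a diagonal Σ_j(X) built from the state-dependent variance estimates; the kernel choice, hyperparameters, and the clipping of negative variance estimates are illustrative assumptions rather than the paper's exact implementation.

    import numpy as np

    def sq_exp_kernel(X1, X2, length=1.0, variance=1.0):
        """Squared-exponential kernel matrix (illustrative choice)."""
        d2 = (X1[:, None] - X2[None, :]) ** 2
        return variance * np.exp(-0.5 * d2 / length ** 2)

    def gp_state_dependent_noise(X, Y, sigma0=0.1, tol=1e-6, max_iter=100):
        """Iterative posterior-GP / prior-GP updates of Algorithm 1 at the training inputs."""
        n = len(X)
        K = sq_exp_kernel(X, X)
        Sigma = np.zeros((n, n))              # Sigma_0(X) = 0
        alpha = np.zeros(n)                   # alpha_0 = 0
        mu = np.zeros(n)
        for _ in range(max_iter):
            # posterior-GP mean estimate (line 4)
            mu = K @ np.linalg.solve(K + sigma0 ** 2 * np.eye(n) + Sigma, Y)
            # training targets for the prior-GP: squared residuals minus base noise variance (line 6)
            Z = (Y - mu) ** 2 - sigma0 ** 2
            # prior-GP estimate of the state-dependent noise variance (line 8)
            g = K @ np.linalg.solve(K + sigma0 ** 2 * np.eye(n), Z)
            Sigma = np.diag(np.maximum(g, 0.0))   # keep variances non-negative (assumption)
            # stopping criterion on the weight vector alpha (lines 10-12)
            alpha_new = np.linalg.solve(K + sigma0 ** 2 * np.eye(n) + Sigma, Y)
            if np.linalg.norm(alpha_new - alpha) < tol:
                alpha = alpha_new
                break
            alpha = alpha_new
        return mu, np.diag(Sigma), alpha

With Sigma held at zero and a single pass, the posterior-GP mean reduces to the standard GP regression estimate, which is the sense in which the state-dependent formulation generalizes ordinary GP regression.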
2305.16246.pdf: The paper provides a new non-asymptotic analysis of distributed temporal difference learning with linear function approximation. The approach relies on one-shot averaging, where N agents run identical local copies of the TD(0) method and average the outcomes only once at the very end. The paper demonstrates a version of the linear time speedup phenomenon, where the convergence time of the distributed process is a factor of N faster than the convergence time of TD(0). The paper discusses the use of the TD(0) algorithm for distributed learning in a Markov decision process. The authors assume that the transition matrix is irreducible and aperiodic, and that the feature vectors are linearly independent and uniformly bounded. They show that, under these assumptions, the sequence of iterates generated by TD(0) learning converges almost surely to a vector satisfying a certain projected Bellman equation. They then propose a distributed implementation of TD(0) learning, where each agent runs the algorithm locally without any communication and, at the end, the agents simply average the results. The paper analyzes a distributed algorithm for temporal difference learning under the assumption that the tuples in step 4 are i.i.d. The main result is that, when the number of iterations is large enough and the step size is small enough, the size of the final error will be divided by N. This results in a factor-of-N speed-up of the entire convergence time (when T is large enough). In addition, when the variance of the temporal difference error dominates the convergence rate, the parallelism among N agents shrinks the variance term by a factor of N. The paper provides a proof of a theorem related to the performance of the TD(0) algorithm in a distributed setting. The theorem states that, under certain conditions, the algorithm will converge to the true value function with probability 1. The paper then compares the performance of the algorithm to other distributed TD algorithms in terms of the TD error. The results show that the algorithm performs better than the other algorithms in both the Gridworld and MountainCar-v1 examples. The paper "Distributed TD(0) with Almost No Communication" by Rui Liu and Alex Olshevsky presents a new method for distributed TD learning that requires very little communication between nodes. The authors prove that their method converges in finite time and provide bounds on the convergence rate. They also compare their method to previous methods in the literature and show that it performs similarly despite the reduced communication.
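A minimal sketch of the one-shot averaging scheme described above: N agents each run TD(0) with linear function approximation on their own stream of (state, reward, next state) tuples and only average the final weight vectors; the transition sampler, feature map, and constant step size are placeholders.

    import numpy as np

    def td0_local(sample_transition, phi, theta0, alpha, T, gamma=0.99):
        """One agent's local TD(0) run; no communication during the T iterations."""
        theta = theta0.copy()
        for _ in range(T):
            s, r, s_next = sample_transition()                        # i.i.d. or Markov sample
            delta = r + gamma * phi(s_next) @ theta - phi(s) @ theta  # TD error
            theta = theta + alpha * delta * phi(s)
        return theta

    def one_shot_averaged_td0(sample_transition, phi, dim, alpha, T, N, gamma=0.99):
        """Run N independent copies of TD(0) and average the outcomes once at the very end."""
        thetas = [td0_local(sample_transition, phi, np.zeros(dim), alpha, T, gamma)
                  for _ in range(N)]
        return np.mean(thetas, axis=0)

Averaging only once at the end is what distinguishes this scheme from consensus-style distributed TD methods that exchange iterates at every step.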