We consider the key challenge of maintenance optimization for asset networks whose degradation parameters are heterogeneous and unknown, and must therefore be inferred from real-time degradation data. Combining stochastic modeling and Bayesian statistics, we formulate a partially observable Markov decision process that incorporates the estimation of shock rates and shock sizes. This formulation allows us to establish analytically that optimal replacement policies are of threshold type in both degradation levels and parameters. Moreover, we propose an open-loop feedback approach that allows policies trained via deep reinforcement learning (DRL) with access to the true parameters to remain effective when deployed using real-time Bayesian point estimates. As a complementary approach, we develop a Bayesian Markov decision process (BMDP) framework in which the agent maintains and updates posterior distributions during deployment, capturing the evolution of parameter uncertainty and enabling scalable DRL policies that adapt as new data become available. We evaluate our approaches on synthetic data and a real-world case involving interventional X-ray filaments. The proposed DRL methods consistently outperform traditional heuristics. Policies trained for the BMDP remain robust when priors are estimated from historical data and perform well in highly heterogeneous networks. Access to the true parameters yields only marginal cost improvements, highlighting the ability of our approach to make effective decisions under limited information.