PrepSKA WP2 Meeting Software and Computing Duncan Hall 2011-October-19
Imaging context 1 of 2 [figure]
Imaging context 2 of 2 [figure]
Agenda: - Progress since 2010 October - CoDR approach and expectations - Presentation from Tim Cornwell - Presentation from Paul Alexander
Progress: SKA Memo 128, Cornwell and Humphreys, 2010 October, SKA Exascale Software Challenges
- HPC challenges and lessons learned from ASKAP
- Streaming vs. batch processing
- Software stack maturity
- The processor-memory gap
- Co-design in ASKAP and SKA
Progress: SKA Memo 132, Humphreys and Cornwell, 2011 January, Analysis of Convolutional Resampling Algorithm Performance
- Gridding as a significant component of processing
- Benchmarking: performance vs. required gridding function size
- Power considerations
- Processing and related requirements
- Analysis of algorithm performance
- Options for optimisation
Progress: sharing challenges
- CALIM 2011, hosted by SPDO; presentation slides and webcam videos available at: http://www2.skatelescope.org/indico/conferencedisplay.py?confid=171
- Liaising with partners and potential partners from industry
- Industry Meeting, Banff, 2011 July
Addressing SKA 1 scale complications:
- Reduce the data rate offered to HPC
- Wait for HPC capabilities/cost to catch up
- Improve the efficiency of algorithms
Addressed by Tim Cornwell
Progress: DRM 1.3 implications for S&C requirements Addressed by Paul Alexander
Agenda: - Progress since 2010 October - CoDR approach and expectations - Presentation from Tim Cornwell - Presentation from Paul Alexander
Agreed CoDR document set: Initial subsystem requirements High level architecture (incl. options): Hardware Software Risk register (incl. proposed mitigations) Software engineering processes Strategy to proceed to next phase Reference documents
Agenda: - Progress since 2010 October - CoDR approach and expectations - Presentation from Tim Cornwell - Presentation from Paul Alexander
Wide field imaging
A quadratic phase term is added to the Fourier transform:

V(u,v,w) = ∫∫ [I(l,m) / √(1 − l² − m²)] e^{j2π w(√(1 − l² − m²) − 1)} e^{j2π(ul + vm)} dl dm

- Convolution in data space ↔ multiplication in image space
- Slices in data space
SKA WP2 2011, Wednesday 19 October 2011
Fresnel scale ratio: R_F = Θ_FOV² / Θ_res
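A quick numeric check of the Fresnel scale ratio, using illustrative values taken from the EoR DRM table later in this deck (5 deg field of view, 2 arcmin angular resolution); the numbers are examples, not a claim about any specific configuration:

```python
import math

def fresnel_scale_ratio(theta_fov_rad, theta_res_rad):
    """R_F = Theta_FOV^2 / Theta_res, both angles in radians."""
    return theta_fov_rad**2 / theta_res_rad

# Illustrative DRM-like numbers: 5 deg FoV, 2 arcmin resolution.
theta_fov = math.radians(5.0)
theta_res = math.radians(2.0 / 60.0)
print(fresnel_scale_ratio(theta_fov, theta_res))  # ~13
```

Because R_F grows quadratically with field of view but only linearly with resolution, wide fields at modest resolution can still drive the cost scalings discussed below.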
Snapshot imaging (Bracewell; MWA)
Instantaneously, arrays are mostly coplanar:

V_{A'B} = ∫∫ [I(l,m) / √(1 − l² − m²)] e^{j2π(ul + vm + w(√(1 − l² − m²) − 1))} dl dm

Fitting a plane w = au + bv to the instantaneous sampling absorbs the w term into shifted coordinates:

V_{A'B} = ∫∫ [I(l,m) / √(1 − l² − m²)] e^{j2π(u[l + a(√(1 − l² − m²) − 1)] + v[m + b(√(1 − l² − m²) − 1)])} dl dm

- Grid on a two-dimensional plane, FFT, correct for the coordinate distortion
- Gridding cost is much lower, but the image must be corrected every integration
Combine snapshot imaging with AWProjection:
- Fit a plane to the instantaneous u,v,w sampling
- Use AWProjection to project down onto the fitted u,v plane
- Set a threshold in w, e.g. 30% of maximum w; refit the plane when the error in the fit exceeds that threshold
w as a function of u,v over hour angle ±4 hours [figure]
Cost scalings: T_AWProjection ∝ R_F²; T_snapshot ∝ R_F/ΔA; T_hybrid ∝ R_F^{4/3}
Position and width errors in snapshot imaging [figure]
Position and width errors in w-projection/snapshot imaging [figure]
Optimum transition point in the hybrid algorithm: AW projection vs. snapshot imaging [figure]
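The T_AWProjection ∝ R_F² and T_hybrid ∝ R_F^{4/3} scalings quoted above imply a crossover in R_F beyond which the hybrid wins. A minimal sketch, with purely illustrative prefactors (the constants below are assumptions, not values from the slides):

```python
# Cost models for the two wide-field variants; prefactors are illustrative.
def t_awprojection(r_f, c_aw=1.0):
    return c_aw * r_f**2

def t_hybrid(r_f, c_h=10.0):  # assume the hybrid pays a 10x per-unit overhead
    return c_h * r_f**(4.0 / 3.0)

# Setting C_aw * R^2 = C_h * R^(4/3) gives R = (C_h / C_aw)**(3/2).
crossover = (10.0 / 1.0)**1.5
print(crossover)  # ~31.6: beyond this R_F the hybrid is cheaper
```

The shallower 4/3 exponent means that for large enough Fresnel ratios the hybrid is always cheaper, regardless of its fixed overhead; the overhead only moves the crossover point.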
Why do we need hybrid wide field algorithms?
- The trade-off space is large: scientific performance, CPU, wall clock time, memory, memory bandwidth
- No single wide field algorithm will suffice
- Need very flexible hybrids: AW projection, W stacking, snapshot imaging, faceting, autotuning
Changes in imaging during scaling work
- AWProject (2007): convolution; W projection + A projection (for the primary beam). Too much CPU; too much memory for the convolution function.
- AProjectWStack (2008): convolution/multiplication; apply the W term in image space. Much less CPU; too much memory for the w stack.
- AWProject + trimmed convolution function (2009): only apply and keep the non-zero part of the convolution function. Still too much memory for the convolution function.
- AWProject + trimmed convolution function + multiple snapshot planes (2011): convolution + slices; fit and remove a w = au + bv plane every 30-60 min. Small memory for the convolution function.
- Serialise normal equations piece-by-piece for MPI (2011): cuts down a short bump in memory use.
No current algorithm will scale as-is to full-field longer baselines (ASKAP 6 km).
ASKAP imaging scaling curve [figure]
Computing costs
- Scale with data rate: time averaging and bandwidth smearing are a vital unknown factor
- Scale with some power of the Fresnel scale
- Optimise algorithms for the scaling power and scaling factor, constrained by scientific performance, e.g. smearing, astrometry
- Back-of-the-envelope arguments can be wrong by large factors, and omit important effects such as the role of errors
- Time for careful analysis
[Flow diagram: Telescope design, DRM, Measurement Equation, Science requirements → Science processing → Computing hardware and software → Computing costs → Computing budget]
Software and Computing: Example DRM Analysis
Paul Alexander
EoR Imaging
- Illustrate the analysis by considering EoR imaging
- The aim is to derive detailed input requirements to the S&C domain
- The analysis is based on DRM 1.3; some questions are already answered
- We will not, and are not yet in a position to, do a detailed analysis of the pipeline required and the full computational and monetary cost
- The input requirements are: the data rate to process; the nature of the input data; the data products required
System concept for analysis: calculate the data rate from the correlator to the ingest pipeline [diagram]
EoR Imaging DRM Requirements
Science requirements from the DRM:
- Redshift coverage: 6-30
- Brightness temperature sensitivity: 1-3 mK
- Angular resolution: 2-5 arcmin
- Radial resolution: 2 Mpc
- Field of view: > 5 deg (set by cosmic variance)
EoR Imaging DRM Requirements
Technical requirements from the DRM:
- Frequency range: 50-240 MHz (critical frequency 100 MHz)
- Frequency resolution: 100 kHz; RFI excision is critical and may need high resolution, ~1 kHz
- Bandwidth: Δf/f ~ 1; cover the complete frequency range in each observation
- Maximum baseline (core): 5 km, to provide the angular resolution
- Baseline for source subtraction: ~200 km
- Integration time: >1000 hrs (set by cosmic variance)
- A/T: >1000 m² K⁻¹
- Antenna diameter: 7 m - 30 m
- Core UV coverage: N_d > 160
Analysis: channel requirements
Straightforward:
- 1.7 × 10⁵ channels at 1 kHz resolution for RFI excision
- 1.7 × 10³ channels in the final data products
- The data rate drops by this factor after the ingest pipeline
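These channel counts are consistent with a 170 MHz processed band; 70-240 MHz is assumed here (the AA-low band starts at 70 MHz per SYS_REQ_1310), although the technical table also quotes 50-240 MHz:

```python
# Assumed processed band: 70-240 MHz (170 MHz wide).
band_hz = 170e6

rfi_channels = band_hz / 1e3      # 1 kHz resolution for RFI excision
final_channels = band_hz / 100e3  # 100 kHz in the final data products

print(rfi_channels)                   # 1.7e5
print(final_channels)                 # 1.7e3
print(rfi_channels / final_channels)  # factor-100 drop after ingest
```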
Analysis: sensitivity and collector distribution
Requirement: 10 mK in a 5 arcmin beam and 3.3 mK in a 2 arcmin beam.
From SYS_REQ_1310 the requirement is that A/T = 1000 m² K⁻¹ across the 70-450 MHz band of the AA-low. Translated in Memo 130 as a total collecting area of 1.25 × 10⁶ m², distributed in 50 stations of 180 m diameter:
- Core (r < 0.5 km): ~50% (25 stations), 6.25 × 10⁵ m², filling factor f = 0.81
- Inner (1 < r < 2.5 km): ~20% (10 stations), 2.5 × 10⁵ m²
- Mid (2.5 < r < 100 km): ~30% (15 stations), 3.75 × 10⁵ m²
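The core area and filling factor quoted above follow directly from the station geometry and can be checked:

```python
import math

station_diameter = 180.0  # m
core_stations = 25
core_radius = 500.0       # m  (core is r < 0.5 km)

station_area = math.pi * (station_diameter / 2)**2
core_collecting_area = core_stations * station_area    # ~6.25e5 m^2
core_footprint = math.pi * core_radius**2              # ground area of the core
filling_factor = core_collecting_area / core_footprint

print(core_collecting_area)  # ~6.4e5 m^2
print(filling_factor)        # ~0.81
```

25 stations of 180 m reproduce both the ~6.25 × 10⁵ m² core collecting area and the f = 0.81 filling factor on the slide.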
Analysis: sensitivity and collector distribution (continued)
- The high filling factor in the core means flexibility in the logical configuration; very important to meet the EoR requirement
- Extensibility to SKA2 gives a filling factor ~1 in the inner region
- Resolution: 2 arcmin corresponds to ~6 km at 70 MHz and ~2.5 km at 240 MHz. N.B. beam forming would still be needed across the full band
- DRM 1.3 matches the station diameter to the 5 degree FoV, giving D = 30 m; in the inner region this gives N ~ 1200, but the data rate scales as N²
- Adopt instead a requirement on UV coverage and take 200 stations of 75 m
- Beyond 2.5 km: 85 stations of 70 m, or 15 stations of 180 m
Analysis: dynamic range
DRM 1.3 gives the flux densities of the faintest EoR structures to be imaged: ~0.3 mJy/beam (1σ) at 100 MHz. N.B. we may need to consider a more sophisticated definition of dynamic range.
As Jonathan pointed out yesterday, source contamination is worse than smooth foregrounds. In a 25 sq-degree field, an order-of-magnitude estimate suggests we would expect to find a 3C-brightness object; even by selecting a region with no 3C-like source, consideration of the source counts suggests it is very likely that the field will still be contaminated by a number of sources with flux > 1 Jy.
This implies a dynamic range requirement of >65 dB. This simple analysis needs to be verified.
Correlator Output Data Rates
For imaging, after correlation the data rate is fixed by straightforward considerations:
- Must sample fast enough (limit on the integration time δt): a baseline of length B/λ rotates through the UV plane at a rate of order ΩB/λ, so its track must not cross a UV (Fourier) cell of size D/λ within one integration, i.e. Ω δt (B/λ) ≲ D/λ
- Must have a small-enough channel width to avoid chromatic aberration: the radial spread between Β/λ and Β/(λ+δλ) must stay within a UV cell
Correlator Output Data Rates
- Adopt criteria similar to the EVLA, but using an isotropic smoothing kernel in the UV plane
- The data rate is then given by the number of antennas, polarizations, beams and channels, the word length, and the integration time
- This can be reduced through the ingest pipeline using additional smoothing, or baseline-dependent integration times and channel widths
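A generic sketch of that rate formula follows; note this is the bare baseline-counting estimate, and the G_out figures quoted on the next slide fold in additional assumptions (smoothing criteria, beam/channel bookkeeping) not reproduced here:

```python
def correlator_output_rate(n_ant, n_pol, n_beam, n_chan, word_bytes, dump_time_s):
    """Correlator output in bytes/s: one visibility word per baseline,
    polarization product, beam and channel, every dump_time_s seconds.
    Generic sketch only; real pipelines apply further reductions."""
    n_baselines = n_ant * (n_ant - 1) // 2
    return n_baselines * n_pol * n_beam * n_chan * word_bytes / dump_time_s

# The baseline count scales as N^2, so doubling the number of
# antennas roughly quadruples the output rate:
r200 = correlator_output_rate(200, 4, 1, 1.7e5, 8, 18.0)
r400 = correlator_output_rate(400, 4, 1, 1.7e5, 8, 18.0)
print(r400 / r200)  # ~4
```

This N² behaviour is why the slide on collector distribution trades station count against station size before fixing the correlator output requirement.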
S&C Requirements
N_D = 200; N_ch = 1.7 × 10⁵; N_b = 16; B_max = 5 km
G_out(RFI) = 27.5 GB/s; G_out = 275 MB/s; δt = 18 s
- An AA with a 45 degree scan allows a 5 hr track per day; a 1000 hr total integration gives 200 days
- Each observation is 500 TB of UV data (at 1 kHz), reducing to 5 TB
- 150 GB per day of processed data cube at 100 kHz channels
- Do we need to store the UV data until the complete 1000 hr integration is finished? Some analysis approaches will require this
- N.B. 200 times larger for 30-m logical stations
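The track and data-volume arithmetic on this slide can be checked directly:

```python
# 5 hr track per day, 1000 hr total integration:
track_hours_per_day = 5.0
total_integration_hours = 1000.0
days = total_integration_hours / track_hours_per_day
print(days)  # 200 days of observing

# UV data volume: averaging 1 kHz channels down to 100 kHz
# shrinks the stored volume by a factor of 100.
uv_per_obs_tb = 500.0     # TB at 1 kHz channels
reduction = 100e3 / 1e3   # channel-width ratio
print(uv_per_obs_tb / reduction)  # 5 TB after averaging
```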
S&C Requirements: long baselines
75-m stations: G_out(RFI) = 2500 GB/s; G_out = 250 GB/s; δt = 0.45 s; N_ch = 1.9 × 10⁴
180-m stations: G_out(RFI) = 187 GB/s; G_out = 8.9 GB/s; δt = 1.08 s; N_B = 67; N_ch = 7.9 × 10³
- Even for 180-m stations with 200 km baselines, full imaging gives 160 TB of UV data per 5-hr track
- Image product: 16k × 16k × 6k (24 TB per field), with 133 fields
- For 25 km baselines: 2k × 2k × 1k (64 GB) per field, 133 fields
- Precise requirements for the calibration and source subtraction need careful consideration, as they could drive requirements for the S&C domain and hence the SKA
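The quoted cube sizes are reproduced if "k" is read as 1000 and roughly 16 bytes are stored per voxel (for example, four polarisation products in single precision); both are assumptions made here for the check, not stated on the slide:

```python
# Assumption: ~16 bytes per image voxel (e.g. 4 pol products x 4-byte floats).
bytes_per_voxel = 16

# 16k x 16k x 6k cube for the 200 km baseline case:
long_baseline_cube = 16_000 * 16_000 * 6_000 * bytes_per_voxel
# 2k x 2k x 1k cube for the 25 km baseline case:
short_baseline_cube = 2_000 * 2_000 * 1_000 * bytes_per_voxel

print(long_baseline_cube / 1e12)  # ~24.6 TB per field (slide: 24 TB)
print(short_baseline_cube / 1e9)  # 64 GB per field (slide: 64 GB)
```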
Conclusions
- Analysis of the DRM naturally raises questions which would not occur to the authors of the science case until the analysis is done; these questions can dominate the requirements
- Suggest joint teams of Science WG + domain input to speed up convergence to the SRR