美文网首页自然语言处理—学习
今日学术视野(2018.8.21)

今日学术视野(2018.8.21)

作者: ZQtGe6 | 来源:发表于2018-08-21 05:22 被阅读120次

cs.AI - 人工智能
cs.CL - 计算与语言
cs.CR - 加密与安全
cs.CV - 机器视觉与模式识别
cs.CY - 计算与社会
cs.DC - 分布式、并行与集群计算
cs.DS - 数据结构与算法
cs.IR - 信息检索
cs.IT - 信息论
cs.LG - 自动学习
cs.MM - 多媒体
cs.NE - 神经与进化计算
cs.RO - 机器人学
cs.SD - 声音处理
cs.SI - 社交网络与信息网络
eess.AS - 语音处理
eess.IV - 图像与视频处理
math.OC - 优化与控制
math.ST - 统计理论
physics.soc-ph - 物理学与社会
q-bio.OT - 其他定量生物学
q-bio.QM - 定量方法
stat.ME - 统计方法论
stat.ML - (统计)机器学习

• [cs.AI]Neuromorphic Architecture for the Hierarchical Temporal Memory
• [cs.CL]Augmenting Statistical Machine Translation with Subword Translation of Out-of-Vocabulary Words
• [cs.CL]Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study
• [cs.CL]Read + Verify: Machine Reading Comprehension with Unanswerable Questions
• [cs.CL]Story Disambiguation: Tracking Evolving News Stories across News and Social Streams
• [cs.CR]Mitigation of Adversarial Attacks through Embedded Feature Selection
• [cs.CR]Reinforcement Learning for Autonomous Defence in Software-Defined Networking
• [cs.CV]Convolutional Neural Networks based Intra Prediction for HEVC
• [cs.CV]Dynamic Routing on Deep Neural Network for Thoracic Disease Classification and Sensitive Area Localization
• [cs.CV]Efficient Single-Shot Multibox Detector for Construction Site Monitoring
• [cs.CV]Ensemble-based Adaptive Single-shot Multi-box Detector
• [cs.CV]Epithelium segmentation using deep learning in H&E-stained prostate specimens with immunohistochemistry as reference standard
• [cs.CV]Joint Training of Low-Precision Neural Network with Quantization Interval Parameters
• [cs.CV]Medical Image Imputation from Image Collections
• [cs.CV]Neural Body Fitting: Unifying Deep Learning and Model-Based Human Pose and Shape Estimation
• [cs.CV]Performance Analysis and Robustification of Single-query 6-DoF Camera Pose Estimation
• [cs.CV]Whole-Slide Mitosis Detection in H&E Breast Histology Using PHH3 as a Reference to Train Distilled Stain-Invariant Convolutional Networks
• [cs.CY]Embedded EthiCS: Integrating Ethics Broadly Across Computer Science Education
• [cs.DC]Jitter-compensated VHT and its application to WSN clock synchronization
• [cs.DC]Optimal Distributed Weighted Set Cover Approximation
• [cs.DC]Session Guarantees with Raft and Hybrid Logical Clocks
• [cs.DS]Efficiently Learning Mixtures of Mallows Models
• [cs.IR]IceBreaker: Solving Cold Start Problem for Video Recommendation Engines
• [cs.IT]Coordinated Scheduling and Spectrum Sharing via Matrix Fractional Programming
• [cs.IT]Multicast With Prioritized Delivery: How Fresh is Your Data?
• [cs.IT]On the Separability of Ergodic Fading MIMO Channels: A Lattice Coding Approach
• [cs.IT]Semi-Supervised Cluster Extraction via a Compressive Sensing Approach
• [cs.IT]Single-Server Multi-Message Private Information Retrieval with Side Information
• [cs.LG]Data Poisoning Attacks in Contextual Bandits
• [cs.LG]Importance mixing: Improving sample reuse in evolutionary policy search methods
• [cs.LG]Motion Prediction of Traffic Actors for Autonomous Driving using Deep Convolutional Networks
• [cs.LG]Robust Compressive Phase Retrieval via Deep Generative Priors
• [cs.MM]First Steps Toward CNN based Source Classification of Document Images Shared Over Messaging App
• [cs.NE]Towards a Theory-Guided Benchmarking Suite for Discrete Black-Box Optimization Heuristics: Profiling (1+λ) EA Variants on OneMax and LeadingOnes
• [cs.RO]Towards Robotic Eye Surgery: Marker-free, Online Hand-eye Calibration using Optical Coherence Tomography Images
• [cs.SD]Quality-Net: An End-to-End Non-intrusive Speech Quality Assessment Model based on BLSTM
• [cs.SI]Bus transport network analysis in Rio de Janeiro based on topological models using Social Networks
• [cs.SI]Characterizing the public perception of WhatsApp through the lens of media
• [eess.AS]Unsupervised adversarial domain adaptation for acoustic scene classification
• [eess.IV]Lifted Wasserstein Matcher for Fast and Robust Topology Tracking
• [math.OC]Decentralized Dictionary Learning Over Time-Varying Digraphs
• [math.ST]Inconsistency of diagonal scaling under high-dimensional limit: a replica approach
• [math.ST]Mixed-Level Column Augmented Uniform Designs
• [math.ST]Modelling Persistence Diagrams with Planar Point Processes, and Revealing Topology with Bagplots
• [math.ST]Non-Asymptotic Behavior of the Maximum Likelihood Estimate of a Discrete Distribution
• [math.ST]Weak convergences of marked empirical processes with applications to goodness-of-fit tests for Markovian processes
• [physics.soc-ph]Efficient sampling of spreading processes on complex networks using a composition and rejection algorithm
• [physics.soc-ph]Large-scale estimation of parking requirements for autonomous mobility on demand systems
• [q-bio.OT]The Function Transformation Omics - Funomics
• [q-bio.QM]Using path signatures to predict a diagnosis of Alzheimer's disease
• [stat.ME]Data Consistency Approach to Model Validation
• [stat.ME]Estimating and accounting for unobserved covariates in high dimensional correlated data
• [stat.ME]Statistical modeling for adaptive trait evolution in randomly evolving environment
• [stat.ML]A bagging and importance sampling approach to Support Vector Machines
• [stat.ML]An N Time-Slice Dynamic Chain Event Graph
• [stat.ML]Correlated Multi-armed Bandits with a Latent Random Source
• [stat.ML]Learning Supervised Topic Models for Classification and Regression from Crowds
• [stat.ML]Multiview Boosting by Controlling the Diversity and the Accuracy of View-specific Voters
• [stat.ML]Randomized Least Squares Regression: Combining Model- and Algorithm-Induced Uncertainties

·····································

• [cs.AI]Neuromorphic Architecture for the Hierarchical Temporal Memory
Abdullah M. Zyarah, Dhireesha Kudithipudi
http://arxiv.org/abs/1808.05839v1

A biomimetic machine intelligence algorithm, that holds promise in creating invariant representations of spatiotemporal input streams is the hierarchical temporal memory (HTM). This unsupervised online algorithm has been demonstrated on several machine-learning tasks, including anomaly detection. Significant effort has been made in formalizing and applying the HTM algorithm to different classes of problems. There are few early explorations of the HTM hardware architecture, especially for the earlier version of the spatial pooler of HTM algorithm. In this article, we present a full-scale HTM architecture for both spatial pooler and temporal memory. Synthetic synapse design is proposed to address the potential and dynamic interconnections occurring during learning. The architecture is interweaved with parallel cells and columns that enable high processing speed for the HTM. The proposed architecture is verified for two different datasets: MNIST and the European number plate font (EUNF), with and without the presence of noise. The spatial pooler architecture is synthesized on Xilinx ZYNQ-7, with 91.16% classification accuracy for MNIST and 90% accuracy for EUNF, with noise. For the temporal memory sequence prediction, first and second order predictions are observed for a 5-number long sequence generated from EUNF dataset and 95% accuracy is obtained. Moreover, the proposed hardware architecture offers 1364X speedup over the software realization. These results indicate that the proposed architecture can serve as a digital core to build the HTM in hardware and eventually as a standalone self-learning system.

• [cs.CL]Augmenting Statistical Machine Translation with Subword Translation of Out-of-Vocabulary Words
Nelson F. Liu, Jonathan May, Michael Pust, Kevin Knight
http://arxiv.org/abs/1808.05700v1

Most statistical machine translation systems cannot translate words that are unseen in the training data. However, humans can translate many classes of out-of-vocabulary (OOV) words (e.g., novel morphological variants, misspellings, and compounds) without context by using orthographic clues. Following this observation, we describe and evaluate several general methods for OOV translation that use only subword information. We pose the OOV translation problem as a standalone task and intrinsically evaluate our approaches on fourteen typologically diverse languages across varying resource levels. Adding OOV translators to a statistical machine translation system yields consistent BLEU gains (0.5 points on average, and up to 2.0) for all fourteen languages, especially in low-resource scenarios.

• [cs.CL]Deep Bayesian Active Learning for Natural Language Processing: Results of a Large-Scale Empirical Study
Aditya Siddhant, Zachary C. Lipton
http://arxiv.org/abs/1808.05697v1

Several recent papers investigate Active Learning (AL) for mitigating the data dependence of deep learning for natural language processing. However, the applicability of AL to real-world problems remains an open question. While in supervised learning, practitioners can try many different methods, evaluating each against a validation set before selecting a model, AL affords no such luxury. Over the course of one AL run, an agent annotates its dataset exhausting its labeling budget. Thus, given a new task, an active learner has no opportunity to compare models and acquisition functions. This paper provides a large scale empirical study of deep active learning, addressing multiple tasks and, for each, multiple datasets, multiple models, and a full suite of acquisition functions. We find that across all settings, Bayesian active learning by disagreement, using uncertainty estimates provided either by Dropout or Bayes-by Backprop significantly improves over i.i.d. baselines and usually outperforms classic uncertainty sampling.

• [cs.CL]Read + Verify: Machine Reading Comprehension with Unanswerable Questions
Minghao Hu, Furu wei, Yuxing Peng, Zhen Huang, Nan Yang, Ming Zhou
http://arxiv.org/abs/1808.05759v1

Machine reading comprehension with unanswerable questions aims to abstain from answering when no answer can be inferred. Previous works using an additional no-answer option attempt to extract answers and detect unanswerable questions simultaneously, but they have struggled to produce high-quality answers and often misclassify questions. In this paper, we propose a novel read-then-verify system that combines a base neural reader with a sentence-level answer verifier trained to further validate if the predicted answer is entailed by input snippets. Moreover, we augment the base reader with two auxiliary losses to better handle answer extraction and no-answer detection respectively, and investigate three different architectures for the answer verifier. Our experiments on the SQuAD 2.0 dataset show that our system can achieve a score of 74.8 F1 on the development set, outperforming the previous best published model by more than 7 points.

• [cs.CL]Story Disambiguation: Tracking Evolving News Stories across News and Social Streams
Bichen Shi, Thanh-Binh Le, Neil Hurley, Georgiana Ifrim
http://arxiv.org/abs/1808.05906v1

Following a particular news story online is an important but difficult task, as the relevant information is often scattered across different domains/sources (e.g., news articles, blogs, comments, tweets), presented in various formats and language styles, and may overlap with thousands of other stories. In this work we join the areas of topic tracking and entity disambiguation, and propose a framework named Story Disambiguation - a cross-domain story tracking approach that builds on real-time entity disambiguation and a learning-to-rank framework to represent and update the rich semantic structure of news stories. Given a target news story, specified by a seed set of documents, the goal is to effectively select new story-relevant documents from an incoming document stream. We represent stories as entity graphs and we model the story tracking problem as a learning-to-rank task. This enables us to track content with high accuracy, from multiple domains, in real-time. We study a range of text, entity and graph based features to understand which type of features are most effective for representing stories. We further propose new semi-supervised learning techniques to automatically update the story representation over time. Our empirical study shows that we outperform the accuracy of state-of-the-art methods for tracking mixed-domain document streams, while requiring fewer labeled data to seed the tracked stories. This is particularly the case for local news stories that are easily over shadowed by other trending stories, and for complex news stories with ambiguous content in noisy stream environments.

• [cs.CR]Mitigation of Adversarial Attacks through Embedded Feature Selection
Ziyi Bao, Luis Muñoz-González, Emil C. Lupu
http://arxiv.org/abs/1808.05705v1

Machine learning has become one of the main components for task automation in many application domains. Despite the advancements and impressive achievements of machine learning, it has been shown that learning algorithms can be compromised by attackers both at training and test time. Machine learning systems are especially vulnerable to adversarial examples where small perturbations added to the original data points can produce incorrect or unexpected outputs in the learning algorithms at test time. Mitigation of these attacks is hard as adversarial examples are difficult to detect. Existing related work states that the security of machine learning systems against adversarial examples can be weakened when feature selection is applied to reduce the systems' complexity. In this paper, we empirically disprove this idea, showing that the relative distortion that the attacker has to introduce to succeed in the attack is greater when the target is using a reduced set of features. We also show that the minimal adversarial examples differ statistically more strongly from genuine examples with a lower number of features. However, reducing the feature count can negatively impact the system's performance. We illustrate the trade-off between security and accuracy with specific examples. We propose a design methodology to evaluate the security of machine learning classifiers with embedded feature selection against adversarial examples crafted using different attack strategies.

• [cs.CR]Reinforcement Learning for Autonomous Defence in Software-Defined Networking
Yi Han, Benjamin I. P. Rubinstein, Tamas Abraham, Tansu Alpcan, Olivier De Vel, Sarah Erfani, David Hubczenko, Christopher Leckie, Paul Montague
http://arxiv.org/abs/1808.05770v1

Despite the successful application of machine learning (ML) in a wide range of domains, adaptability---the very property that makes machine learning desirable---can be exploited by adversaries to contaminate training and evade classification. In this paper, we investigate the feasibility of applying a specific class of machine learning algorithms, namely, reinforcement learning (RL) algorithms, for autonomous cyber defence in software-defined networking (SDN). In particular, we focus on how an RL agent reacts towards different forms of causative attacks that poison its training process, including indiscriminate and targeted, white-box and black-box attacks. In addition, we also study the impact of the attack timing, and explore potential countermeasures such as adversarial training.

• [cs.CV]Convolutional Neural Networks based Intra Prediction for HEVC
Wenxue Cui, Tao Zhang, Shengping Zhang, Feng Jiang, Wangmeng Zuo, Debin Zhao
http://arxiv.org/abs/1808.05734v1

Traditional intra prediction methods for HEVC rely on using the nearest reference lines for predicting a block, which ignore much richer context between the current block and its neighboring blocks and therefore cause inaccurate prediction especially when weak spatial correlation exists between the current block and the reference lines. To overcome this problem, in this paper, an intra prediction convolutional neural network (IPCNN) is proposed for intra prediction, which exploits the rich context of the current block and therefore is capable of improving the accuracy of predicting the current block. Meanwhile, the predictions of the three nearest blocks can also be refined. To the best of our knowledge, this is the first paper that directly applies CNNs to intra prediction for HEVC. Experimental results validate the effectiveness of applying CNNs to intra prediction and achieved significant performance improvement compared to traditional intra prediction methods.

• [cs.CV]Dynamic Routing on Deep Neural Network for Thoracic Disease Classification and Sensitive Area Localization
Yan Shen, Mingchen Gao
http://arxiv.org/abs/1808.05744v1

We present and evaluate a new deep neural network architecture for automatic thoracic disease detection on chest X-rays. Deep neural networks have shown great success in a plethora of visual recognition tasks such as image classification and object detection by stacking multiple layers of convolutional neural networks (CNN) in a feed-forward manner. However, the performance gain by going deeper has reached bottlenecks as a result of the trade-off between model complexity and discrimination power. We address this problem by utilizing the recently developed routing-by agreement mechanism in our architecture. A novel characteristic of our network structure is that it extends routing to two types of layer connections (1) connection between feature maps in dense layers, (2) connection between primary capsules and prediction capsules in final classification layer. We show that our networks achieve comparable results with much fewer layers in the measurement of AUC score. We further show the combined benefits of model interpretability by generating Gradient-weighted Class Activation Mapping (Grad-CAM) for localization. We demonstrate our results on the NIH chestX-ray14 dataset that consists of 112,120 images on 30,805 unique patients including 14 kinds of lung diseases.

• [cs.CV]Efficient Single-Shot Multibox Detector for Construction Site Monitoring
Viral Thakar, Himani Saini, Walid Ahmed, Mohammad M Soltani, Ahmed Aly, Jia Yuan Yu
http://arxiv.org/abs/1808.05730v1

Asset monitoring in construction sites is an intricate, manually intensive task, that can highly benefit from automated solutions engineered using deep neural networks. We use Single-Shot Multibox Detector --- SSD, for its fine balance between speed and accuracy, to leverage ubiquitously available images and videos from the surveillance cameras on the construction sites and automate the monitoring tasks, hence enabling project managers to better track the performance and optimize the utilization of each resource. We propose to improve the performance of SSD by clustering the predicted boxes instead of a greedy approach like non-maximum suppression. We do so using Affinity Propagation Clustering --- APC to cluster the predicted boxes based on the similarity index computed using the spatial features as well as location of predicted boxes. In our attempts, we have been able to improve the mean average precision of SSD by 3.77% on custom dataset consist of images from construction sites and by 1.67% on PASCAL VOC Challenge.

• [cs.CV]Ensemble-based Adaptive Single-shot Multi-box Detector
Viral Thakar, Walid Ahmed, Mohammad M Soltani, Jia Yuan Yu
http://arxiv.org/abs/1808.05727v1

We propose two improvements to the SSD---single shot multibox detector. First, we propose an adaptive approach for default box selection in SSD. This uses data to reduce the uncertainty in the selection of best aspect ratios for the default boxes and improves performance of SSD for datasets containing small and complex objects (e.g., equipments at construction sites). We do so by finding the distribution of aspect ratios of the given training dataset, and then choosing representative values. Secondly, we propose an ensemble algorithm, using SSD as components, which improves the performance of SSD, especially for small amount of training datasets. Compared to the conventional SSD algorithm, adaptive box selection improves mean average precision by 3%, while ensemble-based SSD improves it by 8%.

• [cs.CV]Epithelium segmentation using deep learning in H&E-stained prostate specimens with immunohistochemistry as reference standard
Wouter Bulten, Péter Bándi, Jeffrey Hoven, Rob van de Loo, Johannes Lotz, Nick Weiss, Jeroen van der Laak, Bram van Ginneken, Christina Hulsbergen-van de Kaa, Geert Litjens
http://arxiv.org/abs/1808.05883v1

Prostate cancer (PCa) is graded by pathologists by examining the architectural pattern of cancerous epithelial tissue on hematoxylin and eosin (H&E) stained slides. Given the importance of gland morphology, automatically differentiating between glandular epithelial tissue and other tissues is an important prerequisite for the development of automated methods for detecting PCa. We propose a new method, using deep learning, for automatically segmenting epithelial tissue in digitized prostatectomy slides. We employed immunohistochemistry (IHC) to render the ground truth less subjective and more precise compared to manual outlining on H&E slides, especially in areas with high-grade and poorly differentiated PCa. Our dataset consisted of 102 tissue blocks, including both low and high grade PCa. From each block a single new section was cut, stained with H&E, scanned, restained using P63 and CK8/18 to highlight the epithelial structure, and scanned again. The H&E slides were co-registered to the IHC slides. On a subset of the IHC slides we applied color deconvolution, corrected stain errors manually, and trained a U-Net to perform segmentation of epithelial structures. Whole-slide segmentation masks generated by the IHC U-Net were used to train a second U-Net on H&E. Our system makes precise cell-level segmentations and segments both intact glands as well as individual (tumor) epithelial cells. We achieved an F1-score of 0.895 on a hold-out test set and 0.827 on an external reference set from a different center. We envision this segmentation as being the first part of a fully automated prostate cancer detection and grading pipeline.

• [cs.CV]Joint Training of Low-Precision Neural Network with Quantization Interval Parameters
Sangil Jung, Changyong Son, Seohyung Lee, Jinwoo Son, Youngjun Kwak, Jae-Joon Han, Changkyu Choi
http://arxiv.org/abs/1808.05779v1

Optimization for low-precision neural network is an important technique for deep convolutional neural network models to be deployed to mobile devices. In order to realize convolutional layers with the simple bit-wise operations, both activation and weight parameters need to be quantized with a low bit-precision. In this paper, we propose a novel optimization method for low-precision neural network which trains both activation quantization parameters and the quantized model weights. We parameterize the quantization intervals of the weights and the activations and train the parameters with the full-precision weights by directly minimizing the training loss rather than minimizing the quantization error. Thanks to the joint optimization of quantization parameters and model weights, we obtain the highly accurate low-precision network given a target bitwidth. We demonstrated the effectiveness of our method on two benchmarks: CIFAR-10 and ImageNet.

• [cs.CV]Medical Image Imputation from Image Collections
Adrian V. Dalca, Katherine L. Bouman, William T. Freeman, Natalia S. Rost, Mert R. Sabuncu, Polina Golland
http://arxiv.org/abs/1808.05732v1

We present an algorithm for creating high resolution anatomically plausible images consistent with acquired clinical brain MRI scans with large inter-slice spacing. Although large data sets of clinical images contain a wealth of information, time constraints during acquisition result in sparse scans that fail to capture much of the anatomy. These characteristics often render computational analysis impractical as many image analysis algorithms tend to fail when applied to such images. Highly specialized algorithms that explicitly handle sparse slice spacing do not generalize well across problem domains. In contrast, we aim to enable application of existing algorithms that were originally developed for high resolution research scans to significantly undersampled scans. We introduce a generative model that captures fine-scale anatomical structure across subjects in clinical image collections and derive an algorithm for filling in the missing data in scans with large inter-slice spacing. Our experimental results demonstrate that the resulting method outperforms state-of-the-art upsampling super-resolution techniques, and promises to facilitate subsequent analysis not previously possible with scans of this quality. Our implementation is freely available at https://github.com/adalca/papago .

• [cs.CV]Neural Body Fitting: Unifying Deep Learning and Model-Based Human Pose and Shape Estimation
Mohamed Omran, Christoph Lassner, Gerard Pons-Moll, Peter V. Gehler, Bernt Schiele
http://arxiv.org/abs/1808.05942v1

Direct prediction of 3D body pose and shape remains a challenge even for highly parameterized deep learning models. Mapping from the 2D image space to the prediction space is difficult: perspective ambiguities make the loss function noisy and training data is scarce. In this paper, we propose a novel approach (Neural Body Fitting (NBF)). It integrates a statistical body model within a CNN, leveraging reliable bottom-up semantic body part segmentation and robust top-down body model constraints. NBF is fully differentiable and can be trained using 2D and 3D annotations. In detailed experiments, we analyze how the components of our model affect performance, especially the use of part segmentations as an explicit intermediate representation, and present a robust, efficiently trainable framework for 3D human pose estimation from 2D images with competitive results on standard benchmarks. Code will be made available at http://github.com/mohomran/neural_body_fitting

• [cs.CV]Performance Analysis and Robustification of Single-query 6-DoF Camera Pose Estimation
Junsheng Fu, Said Pertuz, Jiri Matas, Joni-Kristian Kämäräinen
http://arxiv.org/abs/1808.05848v1

We consider a single-query 6-DoF camera pose estimation with reference images and a point cloud, i.e. the problem of estimating the position and orientation of a camera by using reference images and a point cloud. In this work, we perform a systematic comparison of three state-of-the-art strategies for 6-DoF camera pose estimation, i.e. feature-based, photometric-based and mutual-information-based approaches. The performance of the studied methods is evaluated on two standard datasets in terms of success rate, translation error and max orientation error. Building on the results analysis, we propose a hybrid approach that combines feature-based and mutual-information-based pose estimation methods since it provides complementary properties for pose estimation. Experiments show that (1) in cases with large environmental variance, the hybrid approach outperforms feature-based and mutual-information-based approaches by an average of 25.1% and 5.8% in terms of success rate, respectively; (2) in cases where query and reference images are captured at similar imaging conditions, the hybrid approach performs similarly as the feature-based approach, but outperforms both photometric-based and mutual-information-based approaches with a clear margin; (3) the feature-based approach is consistently more accurate than mutual-information-based and photometric-based approaches when at least 4 consistent matching points are found between the query and reference images.

• [cs.CV]Whole-Slide Mitosis Detection in H&E Breast Histology Using PHH3 as a Reference to Train Distilled Stain-Invariant Convolutional Networks
David Tellez, Maschenka Balkenhol, Irene Otte-Holler, Rob van de Loo, Rob Vogels, Peter Bult, Carla Wauters, Willem Vreuls, Suzanne Mol, Nico Karssemeijer, Geert Litjens, Jeroen van der Laak, Francesco Ciompi
http://arxiv.org/abs/1808.05896v1

Manual counting of mitotic tumor cells in tissue sections constitutes one of the strongest prognostic markers for breast cancer. This procedure, however, is time-consuming and error-prone. We developed a method to automatically detect mitotic figures in breast cancer tissue sections based on convolutional neural networks (CNNs). Application of CNNs to hematoxylin and eosin (H&E) stained histological tissue sections is hampered by: (1) noisy and expensive reference standards established by pathologists, (2) lack of generalization due to staining variation across laboratories, and (3) high computational requirements needed to process gigapixel whole-slide images (WSIs). In this paper, we present a method to train and evaluate CNNs to specifically solve these issues in the context of mitosis detection in breast cancer WSIs. First, by combining image analysis of mitotic activity in phosphohistone-H3 (PHH3) restained slides and registration, we built a reference standard for mitosis detection in entire H&E WSIs requiring minimal manual annotation effort. Second, we designed a data augmentation strategy that creates diverse and realistic H&E stain variations by modifying the hematoxylin and eosin color channels directly. Using it during training combined with network ensembling resulted in a stain invariant mitosis detector. Third, we applied knowledge distillation to reduce the computational requirements of the mitosis detection ensemble with a negligible loss of performance. The system was trained in a single-center cohort and evaluated in an independent multicenter cohort from The Cancer Genome Atlas on the three tasks of the Tumor Proliferation Assessment Challenge (TUPAC). We obtained a performance within the top-3 best methods for most of the tasks of the challenge.

• [cs.CY]Embedded EthiCS: Integrating Ethics Broadly Across Computer Science Education
Barbara J. Grosz, David Gray Grant, Kate Vredenburgh, Jeff Behrends, Lily Hu, Alison Simmons, Jim Waldo
http://arxiv.org/abs/1808.05686v1

Computing technologies have become pervasive in daily life, sometimes bringing unintended but harmful consequences. For students to learn to think not only about what technology they could create, but also about what technology they should create, computer science curricula must expand to include ethical reasoning about the societal value and impact of these technologies. This paper presents Embedded EthiCS, a novel approach to integrating ethics into computer science education that incorporates ethical reasoning throughout courses in the standard computer science curriculum. It thus changes existing courses rather than requiring wholly new courses. The paper describes a pilot Embedded EthiCS program that embeds philosophers teaching ethical reasoning directly into computer science courses. It discusses lessons learned and challenges to implementing such a program across different types of academic institutions.

• [cs.DC]Jitter-compensated VHT and its application to WSN clock synchronization
Federico Terraneo, Fabiano Riccardi, Alberto Leva
http://arxiv.org/abs/1808.05696v1

Accurate and energy-efficient clock synchronization is an enabler for many applications of Wireless Sensor Networks. A fine-grained synchronization is beneficial both at the system level, for example to favor deterministic radio protocols, and at the application level, when network-wide event timestamping is required. However, there is a tradeoff between the resolution of a WSN node's timekeeping device and its energy consumption. The Virtual High-resolution Timer (VHT) is an innovative solution, that was proposed to overcome this tradeoff. It combines a high-resolution oscillator to a low-power one, turning off the former when not needed. In this paper we improve VHT by first identifying the jitter of the low-power oscillator as the current limit to the technique, and then proposing an enhanced solution that synchronizes the fast and the slow clock, rejecting the said jitter. The improved VHT is also less demanding than the original technique in terms of hardware resources. Experimental results show the achieved advantages in terms of accuracy.

• [cs.DC]Optimal Distributed Weighted Set Cover Approximation
Ran Ben-Basat, Guy Even, Ken-ichi Kawarabayashi, Gregory Schwartzman
http://arxiv.org/abs/1808.05809v1

We present a time-optimal deterministic distributed algorithm for approximating a minimum weight vertex cover in hypergraphs of rank f. This problem is equivalent to the Minimum Weight Set Cover Problem in which the frequency of every element is bounded by f. The approximation factor of our algorithm is (f+\epsilon). Let \Delta denote the maximum degree in the hypergraph. Our algorithm runs in the CONGEST model and requires O(\log{\Delta} / \log \log \Delta) rounds, for constants \epsilon \in (0,1] and f\in\mathbb N^+. This is the first distributed algorithm for this problem whose running time does not depend on the vertex weights or the number of vertices. Thus adding another member to the exclusive family of \emph{provably optimal} distributed algorithms. For constant values of f and \epsilon, our algorithm improves over the (f+\epsilon)-approximation algorithm of \cite{KuhnMW06} whose running time is O(\log \Delta + \log W), where W is the ratio between the largest and smallest vertex weights in the graph.

• [cs.DC]Session Guarantees with Raft and Hybrid Logical Clocks
Mohammad Roohitavaf, Jung-Sang Ahn, Woon-Hak Kang, Kun Ren, Gene Zhang, Sami Ben-Romdhane, Sandeep S. Kulkarni
http://arxiv.org/abs/1808.05698v1

Eventual consistency is a popular consistency model for geo-replicated data stores. Although eventual consistency provides high performance and availability, it can cause anomalies that make programming complex for application developers. Session guarantees can remove some of these anomalies while causing much lower overhead compared with stronger consistency models. In this paper, we provide a protocol for providing session guarantees for NuKV, a key-value store developed for services with very high availability and performance requirements at eBay. NuKV relies on the Raft protocol for replication inside datacenters, and uses eventual consistency for replication among datacenters. We provide modified versions of conventional session guarantees to avoid the problem of slowdown cascades in systems with large numbers of partitions. We also use Hybrid Logical Clocks to eliminate the need for delaying write operations to satisfy session guarantees. Our experiments show that our protocol provides session guarantees with a negligible overhead when compared with eventual consistency.

• [cs.DS]Efficiently Learning Mixtures of Mallows Models
Allen Liu, Ankur Moitra
http://arxiv.org/abs/1808.05731v1

Mixtures of Mallows models are a popular generative model for ranking data coming from a heterogeneous population. They have a variety of applications including social choice, recommendation systems and natural language processing. Here we give the first polynomial time algorithm for provably learning the parameters of a mixture of Mallows models with any constant number of components. Prior to our work, only the two component case had been settled. Our analysis revolves around a determinantal identity of Zagier which was proven in the context of mathematical physics, which we use to show polynomial identifiability and ultimately to construct test functions to peel off one component at a time. To complement our upper bounds, we show information-theoretic lower bounds on the sample complexity as well as lower bounds against restricted families of algorithms that make only local queries. Together, these results demonstrate various impediments to improving the dependence on the number of components. They also motivate the study of learning mixtures of Mallows models from the perspective of beyond worst-case analysis. In this direction, we show that when the scaling parameters of the Mallows models have separation, there are much faster learning algorithms.

• [cs.IR]IceBreaker: Solving Cold Start Problem for Video Recommendation Engines
Yaman Kumar, Agniv Sharma, Abhigyan Khaund, Akash Kumar, Ponnurangam Kumaraguru, Rajiv Ratn Shah
http://arxiv.org/abs/1808.05636v1

Internet has brought about a tremendous increase in content of all forms and, in that, video content constitutes the major backbone of the total content being published as well as watched. Thus it becomes imperative for video recommendation engines such as Hulu to look for novel and innovative ways to recommend the newly added videos to their users. However, the problem with new videos is that they lack any sort of metadata and user interaction so as to be able to rate the videos for the consumers. To this effect, this paper introduces the several techniques we develop for the Content Based Video Relevance Prediction (CBVRP) Challenge being hosted by Hulu for the ACM Multimedia Conference 2018. We employ different architectures on the CBVRP dataset to make use of the provided frame and video level features and generate predictions of videos that are similar to the other videos. We also implement several ensemble strategies to explore complementarity between both the types of provided features. The obtained results are encouraging and will impel the boundaries of research for multimedia based video recommendation systems.

• [cs.IT]Coordinated Scheduling and Spectrum Sharing via Matrix Fractional Programming
Kaiming Shen, Wei Yu, Licheng Zhao, Daniel P. Palomar
http://arxiv.org/abs/1808.05678v1

Interference management is a fundamental issue in device-to-device (D2D) communications whenever the transmitter-and-receiver pairs are located geographically in close proximity and frequencies are fully reused, so active links may severely interfere with each other. This paper devises an optimization strategy named FPLinQ to coordinate the link scheduling decisions among the interfering links, along with power control and beamforming. The key enabler is a novel optimization method called matrix fractional programming (FP) that generalizes previous scalar and vector forms of FP in allowing multiple data streams per link. From a theoretical perspective, this paper provides a deeper understanding of FP by showing a connection to the minorization-maximization (MM) algorithm. From an application perspective, this paper shows that as compared to the existing methods for coordinating scheduling in the D2D network, such as FlashLinQ, ITLinQ, and ITLinQ+, the proposed FPLinQ approach is more general in allowing multiple antennas at both the transmitters and the receivers, and further in allowing arbitrary and multiple possible associations between the devices via matching. Numerical results show that FPLinQ significantly outperforms the previous state-of-the-art in a typical D2D communication environment.

• [cs.IT]Multicast With Prioritized Delivery: How Fresh is Your Data?
Jing Zhong, Roy D. Yates, Emina Soljanin
http://arxiv.org/abs/1808.05738v1

We consider a multicast network in which real-time status updates generated by a source are replicated and sent to multiple interested receiving nodes through independent links. The receiving nodes are divided into two groups: one priority group consists of k nodes that require the reception of every update packet, the other non-priority group consists of all other nodes without the delivery requirement. Using age of information as a freshness metric, we analyze the time-averaged age at both priority and non-priority nodes. For shifted-exponential link delay distributions, the average age at a priority node is lower than that at a non-priority node due to the delivery guarantee. However, this advantage for priority nodes disappears if the link delay is exponential distributed. Both groups of nodes have the same time-averaged age, which implies that the guaranteed delivery of updates has no effect the time-averaged freshness.

• [cs.IT]On the Separability of Ergodic Fading MIMO Channels: A Lattice Coding Approach
Ahmed Hindy, Aria Nosratinia
http://arxiv.org/abs/1808.05922v1

This paper addresses point-to-point communication over block-fading channels with independent fading blocks. When both channel state information at the transmitter (CSIT) and receiver (CSIR) are available, most achievable schemes use separable coding, i.e., coding independently and in parallel over different fading states. Unfortunately, separable coding has drawbacks including large memory requirements at both communication ends. In this paper a lattice coding and decoding scheme is proposed that achieves the ergodic capacity without separable coding, with lattice codebooks and decoding decision regions that are universal across channel realizations. We first demonstrate this result for fading distributions with discrete, finite support whose sequences are robustly typical. Results are then extended to continuous fading distributions, as well as multiple-input multiple-output (MIMO) systems. In addition, a variant of the proposed scheme is presented for the MIMO ergodic fading channel with CSIR only, where we prove the existence of a universal codebook that achieves rates within a constant gap to capacity for finite-support fading distributions. The gap is small compared with other schemes in the literature. Extension to continuous-valued fading is also provided.

• [cs.IT]Semi-Supervised Cluster Extraction via a Compressive Sensing Approach
Ming-Jun Lai, Daniel Mckenzie
http://arxiv.org/abs/1808.05780v1

We use techniques from compressive sensing to design a local clustering algorithm by treating the cluster indicator vector as a sparse solution to a linear system whose coefficient matrix is the graph Laplacian. If the graph is drawn from the Stochastic Block Model we are able to prove that the fraction of misclassified vertices goes to zero as the size of the graph increases. Numerical experiments on simulated and real-life graphs demonstrate the effectiveness and speed of our approach. Finally, we explore the application of our algorithm to semi-supervised learning.

• [cs.IT]Single-Server Multi-Message Private Information Retrieval with Side Information
Su Li, Michael Gastpar
http://arxiv.org/abs/1808.05797v1

We study the problem of single-server multi-message private information retrieval with side information. One user wants to recover N out of K independent messages which are stored at a single server. The user initially possesses a subset of M messages as side information. The goal of the user is to download the N demand messages while not leaking any information about the indices of these messages to the server. In this paper, we characterize the minimum number of required transmissions. We also present the optimal linear coding scheme which enables the user to download the demand messages and preserves the privacy of their indices. Moreover, we show that the trivial MDS coding scheme with K-M transmissions is optimal if N>M or N^2+N \ge K-M. This means if one wishes to privately download more than the square-root of the number of files in the database, then one must effectively download the full database (minus the side information), irrespective of the amount of side information one has available.

• [cs.LG]Data Poisoning Attacks in Contextual Bandits
Yuzhe Ma, Kwang-Sung Jun, Lihong Li, Xiaojin Zhu
http://arxiv.org/abs/1808.05760v1

We study offline data poisoning attacks in contextual bandits, a class of reinforcement learning problems with important applications in online recommendation and adaptive medical treatment, among others. We provide a general attack framework based on convex optimization and show that by slightly manipulating rewards in the data, an attacker can force the bandit algorithm to pull a target arm for a target contextual vector. The target arm and target contextual vector are both chosen by the attacker. That is, the attacker can hijack the behavior of a contextual bandit. We also investigate the feasibility and the side effects of such attacks, and identify future directions for defense. Experiments on both synthetic and real-world data demonstrate the efficiency of the attack algorithm.

• [cs.LG]Importance mixing: Improving sample reuse in evolutionary policy search methods
Aloïs Pourchot, Nicolas Perrin, Olivier Sigaud
http://arxiv.org/abs/1808.05832v1

Deep neuroevolution, that is evolutionary policy search methods based on deep neural networks, have recently emerged as a competitor to deep reinforcement learning algorithms due to their better parallelization capabilities. However, these methods still suffer from a far worse sample efficiency. In this paper we investigate whether a mechanism known as "importance mixing" can significantly improve their sample efficiency. We provide a didactic presentation of importance mixing and we explain how it can be extended to reuse more samples. Then, from an empirical comparison based on a simple benchmark, we show that, though it actually provides better sample efficiency, it is still far from the sample efficiency of deep reinforcement learning, though it is more stable.

• [cs.LG]Motion Prediction of Traffic Actors for Autonomous Driving using Deep Convolutional Networks
Nemanja Djuric, Vladan Radosavljevic, Henggang Cui, Thi Nguyen, Fang-Chieh Chou, Tsung-Han Lin, Jeff Schneider
http://arxiv.org/abs/1808.05819v1

Recent algorithmic improvements and hardware breakthroughs resulted in a number of success stories in the field of AI impacting our daily lives. However, despite its ubiquity AI is only just starting to make advances in what may arguably have the largest impact thus far, the nascent field of autonomous driving. In this work we discuss this important topic and address one of crucial aspects of the emerging area, the problem of predicting future state of autonomous vehicle's surrounding necessary for safe and efficient operations. We introduce a deep learning-based approach that takes into account current state of traffic actors and produces rasterized representations of each actor's vicinity. The raster images are then used by deep convolutional models to infer future movement of actors while accounting for inherent uncertainty of the prediction task. Extensive experiments on real-world data strongly suggest benefits of the proposed approach. Moreover, following successful tests the system was deployed to a fleet of autonomous vehicles.

• [cs.LG]Robust Compressive Phase Retrieval via Deep Generative Priors
Fahad Shamshad, Ali Ahmed
http://arxiv.org/abs/1808.05854v1

This paper proposes a new framework to regularize the highly ill-posed and non-linear phase retrieval problem through deep generative priors using simple gradient descent algorithm. We experimentally show effectiveness of proposed algorithm for random Gaussian measurements (practically relevant in imaging through scattering media) and Fourier friendly measurements (relevant in optical set ups). We demonstrate that proposed approach achieves impressive results when compared with traditional hand engineered priors including sparsity and denoising frameworks for number of measurements and robustness against noise. Finally, we show the effectiveness of the proposed approach on a real transmission matrix dataset in an actual application of multiple scattering media imaging.

• [cs.MM]First Steps Toward CNN based Source Classification of Document Images Shared Over Messaging App
Sharad Joshi, Suraj Saxena, Nitin Khanna
http://arxiv.org/abs/1808.05941v1

Knowledge of source smartphone corresponding to a document image can be helpful in a variety of applications including copyright infringement, ownership attribution, leak identification and usage restriction. In this letter, we investigate a convolutional neural network-based approach to solve source smartphone identification problem for printed text documents which have been captured by smartphone cameras and shared over messaging platform. In absence of any publicly available dataset addressing this problem, we introduce a new image dataset consisting of 315 images of documents printed in three different fonts, captured using 21 smartphones and shared over WhatsApp. Experiments conducted on this dataset demonstrate that, in all scenarios, the proposed system performs as well as or better than the state-of-the-art system based on handcrafted features and classification of letters extracted from document images. The new dataset and code of the proposed system will be made publicly available along with this letter's publication, presently they are submitted for review.

• [cs.NE]Towards a Theory-Guided Benchmarking Suite for Discrete Black-Box Optimization Heuristics: Profiling (1+λ) EA Variants on OneMax and LeadingOnes
Carola Doerr, Furong Ye, Sander van Rijn, Hao Wang, Thomas Bäck
http://arxiv.org/abs/1808.05850v1

Theoretical and empirical research on evolutionary computation methods complement each other by providing two fundamentally different approaches towards a better understanding of black-box optimization heuristics. In discrete optimization, both streams developed rather independently of each other, but we observe today an increasing interest in reconciling these two sub-branches. In continuous optimization, the COCO (COmparing Continuous Optimisers) benchmarking suite has established itself as an important platform that theoreticians and practitioners use to exchange research ideas and questions. No widely accepted equivalent exists in the research domain of discrete black-box optimization. Marking an important step towards filling this gap, we adjust the COCO software to pseudo-Boolean optimization problems, and obtain from this a benchmarking environment that allows a fine-grained empirical analysis of discrete black-box heuristics. In this documentation we demonstrate how this test bed can be used to profile the performance of evolutionary algorithms. More concretely, we study the optimization behavior of several (1+\lambda) EA variants on the two benchmark problems OneMax and LeadingOnes. This comparison motivates a refined analysis for the optimization time of the (1+\lambda) EA on LeadingOnes.

• [cs.RO]Towards Robotic Eye Surgery: Marker-free, Online Hand-eye Calibration using Optical Coherence Tomography Images
Mingchuan Zhou, Mahdi Hamad, Jakob Weiss, Abouzar Eslami, Kai Huang, Mathias Maier, Chris P. Lohmann, Nassir Navab, Alois Knoll, M. Ali Nasseri
http://arxiv.org/abs/1808.05805v1

Ophthalmic microsurgery is known to be a challenging operation, which requires very precise and dexterous manipulation. Image guided robot-assisted surgery (RAS) is a promising solution that brings significant improvements in outcomes and reduces the physical limitations of human surgeons. However, this technology must be further developed before it can be routinely used in clinics. One of the problems is the lack of proper calibration between the robotic manipulator and appropriate imaging device. In this work, we developed a flexible framework for hand-eye calibration of an ophthalmic robot with a microscope-integrated Optical Coherence Tomography (MIOCT) without any markers. The proposed method consists of three main steps: a) we estimate the OCT calibration parameters; b) with micro-scale displacements controlled by the robot, we detect and segment the needle tip in 3D-OCT volume; c) we find the transformation between the coordinate system of the OCT camera and the coordinate system of the robot. We verified the capability of our framework in ex-vivo pig eye experiments and compared the results with a reference method (marker-based). In all experiments, our method showed a small difference from the marker based method, with a mean calibration error of 9.2 \mum and 7.0 \mum, respectively. Additionally, the noise test shows the robustness of the proposed method.

• [cs.SD]Quality-Net: An End-to-End Non-intrusive Speech Quality Assessment Model based on BLSTM
Szu-Wei Fu, Yu Tsao, Hsin-Te Hwang, Hsin-Min Wang
http://arxiv.org/abs/1808.05344v2

Nowadays, most of the objective speech quality assessment tools (e.g., perceptual evaluation of speech quality (PESQ)) are based on the comparison of the degraded/processed speech with its clean counterpart. The need of a "golden" reference considerably restricts the practicality of such assessment tools in real-world scenarios since the clean reference usually cannot be accessed. On the other hand, human beings can readily evaluate the speech quality without any reference (e.g., mean opinion score (MOS) tests), implying the existence of an objective and non-intrusive (no clean reference needed) quality assessment mechanism. In this study, we propose a novel end-to-end, non-intrusive speech quality evaluation model, termed Quality-Net, based on bidirectional long short-term memory. The evaluation of utterance-level quality in Quality-Net is based on the frame-level assessment. Frame constraints and sensible initializations of forget gate biases are applied to learn meaningful frame-level quality assessment from the utterance-level quality label. Experimental results show that Quality-Net can yield high correlation to PESQ (0.9 for the noisy speech and 0.84 for the speech processed by speech enhancement). We believe that Quality-Net has potential to be used in a wide variety of applications of speech signal processing.

• [cs.SI]Bus transport network analysis in Rio de Janeiro based on topological models using Social Networks
Louise Pumar, Rafael Barbastefano, Diego Carvalho
http://arxiv.org/abs/1808.05692v1

In recent years, studies on public transport networks have intensified in the Social Networks field, especially in bus networks. This has occurred because of the relevance of urban mobility for the proper functioning of a city. The Rio de Janeiro city, Brazil, has undergone recent changes in the city of Rio de Janeiro, Brazil, has undergone recent changes in its municipal bus system, modifying several lines and bus stops. This paper proposes to analyze the structure of the bus transportation network of this city, comparing its topology in 2014 and 2016, before and after the change. For this, the properties of the bus system were investigated based on the topological models B-space, P-space, and C-space. Some essential parameters were calculated, such as the giant component, distance, diameter, degree, closeness, and betweenness. The results showed that there was a reduction of 22.75% of the lines and 5.19% of the bus stops from 2014 to 2016. It was also verified that a maximum of four lines is required to move between any two bus stops within the city in both years. However, with three lines is possible to reach more than 99% of the bus stops. Besides, this study also suggests exploring the C-space network according to a minimum number of frequent bus stops that the lines had. Based on the component giant analysis of these networks with many common points, it is possible to detect possible expressway corridors.

• [cs.SI]Characterizing the public perception of WhatsApp through the lens of media
Josemar Alves Caetano, Gabriel Magno, Evandro Cunha, Wagner Meira Jr., Humberto T. Marques-Neto, Virgilio Almeida
http://arxiv.org/abs/1808.05927v1

WhatsApp is, as of 2018, a significant component of the global information and communication infrastructure, especially in developing countries. However, probably due to its strong end-to-end encryption, WhatsApp became an attractive place for the dissemination of misinformation, extremism and other forms of undesirable behavior. In this paper, we investigate the public perception of WhatsApp through the lens of media. We analyze two large datasets of news and show the kind of content that is being associated with WhatsApp in different regions of the world and over time. Our analyses include the examination of named entities, general vocabulary, and topics addressed in news articles that mention WhatsApp, as well as the polarity of these texts. Among other results, we demonstrate that the vocabulary and topics around the term "whatsapp" in the media have been changing over the years and in 2018 concentrate on matters related to misinformation, politics and criminal scams. More generally, our findings are useful to understand the impact that tools like WhatsApp play in the contemporary society and how they are seen by the communities themselves.

• [eess.AS]Unsupervised adversarial domain adaptation for acoustic scene classification
Shayan Gharib, Konstantinos Drossos, Emre Çakir, Dmitriy Serdyuk, Tuomas Virtanen
http://arxiv.org/abs/1808.05777v1

A general problem in acoustic scene classification task is the mismatched conditions between training and testing data, which significantly reduces the performance of the developed methods on classification accuracy. As a countermeasure, we present the first method of unsupervised adversarial domain adaptation for acoustic scene classification. We employ a model pre-trained on data from one set of conditions and by using data from other set of conditions, we adapt the model in order that its output cannot be used for classifying the set of conditions that input data belong to. We use a freely available dataset from the DCASE 2018 challenge Task 1, subtask B, that contains data from mismatched recording devices. We consider the scenario where the annotations are available for the data recorded from one device, but not for the rest. Our results show that with our model agnostic method we can achieve \sim 10\% increase at the accuracy on an unseen and unlabeled dataset, while keeping almost the same performance on the labeled dataset.

• [eess.IV]Lifted Wasserstein Matcher for Fast and Robust Topology Tracking
Maxime Soler, Mélanie Plainchault, Bruno Conche, Julien Tierny
http://arxiv.org/abs/1808.05870v1

This paper presents a robust and efficient method for tracking topological features in time-varying scalar data. Structures are tracked based on the optimal matching between persistence diagrams with respect to the Wasserstein metric. This fundamentally relies on solving the assignment problem, a special case of optimal transport, for all consecutive timesteps. Our approach relies on two main contributions. First, we revisit the seminal assignment algorithm by Kuhn and Munkres which we specifically adapt to the problem of matching persistence diagrams in an efficient way. Second, we propose an extension of the Wasserstein metric that significantly improves the geometrical stability of the matching of domain-embedded persistence pairs. We show that this geometrical lifting has the additional positive side-effect of improving the assignment matrix sparsity and therefore computing time. The global framework implements a coarse-grained parallelism by computing persistence diagrams and finding optimal matchings in parallel for every couple of consecutive timesteps. Critical trajectories are constructed by associating successively matched persistence pairs over time. Merging and splitting events are detected with a geometrical threshold in a post-processing stage. Extensive experiments on real-life datasets show that our matching approach is an order of magnitude faster than the seminal Munkres algorithm. Moreover, compared to a modern approximation method, our method provides competitive runtimes while yielding exact results. We demonstrate the utility of our global framework by extracting critical point trajectories from various simulated time-varying datasets and compare it to the existing methods based on associated overlaps of volumes. Robustness to noise and temporal resolution downsampling is empirically demonstrated.

• [math.OC]Decentralized Dictionary Learning Over Time-Varying Digraphs
Amir Daneshmand, Ying Sun, Gesualdo Scutari, Francisco Facchinei, Brian M. Sadler
http://arxiv.org/abs/1808.05933v1

This paper studies Dictionary Learning problems wherein the learning task is distributed over a multi-agent network, modeled as a time-varying directed graph. This formulation is relevant, for instance, in Big Data scenarios where massive amounts of data are collected/stored in different locations (e.g., sensors, clouds) and aggregating and/or processing all data in a fusion center might be inefficient or unfeasible, due to resource limitations, communication overheads or privacy issues. We develop a unified decentralized algorithmic framework for this class of nonconvex problems, and we establish its asymptotic convergence to stationary solutions. The new method hinges on Successive Convex Approximation techniques, coupled with a decentralized tracking mechanism aiming at locally estimating the gradient of the smooth part of the sum-utility. To the best of our knowledge, this is the first provably convergent decentralized algorithm for Dictionary Learning and, more generally, bi-convex problems over (time-varying) (di)graphs.

• [math.ST]Inconsistency of diagonal scaling under high-dimensional limit: a replica approach
Tomonari Sei
http://arxiv.org/abs/1808.05781v1

In this note, we claim that diagonal scaling of a sample covariance matrix is asymptotically inconsistent if the ratio of the dimension to the sample size converges to a positive constant, where population is assumed to be Gaussian with a spike covariance model. Our non-rigorous proof relies on the replica method developed in statistical physics. In contrast to similar results known in literature on principal component analysis, the strong inconsistency is not observed. Numerical experiments support the derived formulas.

• [math.ST]Mixed-Level Column Augmented Uniform Designs
Feng Yang, Yong-Dao Zhou, Aijun Zhang
http://arxiv.org/abs/1808.05923v1

In follow-up designs, some additional factors with two or three levels may be added in the follow-up stage since they are quite important but neglected in the first stage. Such follow-up designs are called mixed-level column augmented designs. In this paper, based on the initial designs, mixed-level column augmented uniform designs are proposed by using the uniformity criterion such as wrap-around L2-discrepancy (WD). The additional factors can be three-level or blocking, where the additional blocking factors can be regarded as additional common two-level factors. The multi-stage augmented situation is also investigated. We present the analytical expressions and the corresponding lower bounds of the WD of the column augmented designs. It is interesting to show that the column augmented uniform designs are also the optimal designs under the non-orthogonality criterion, E(f_{NOD}). Furthermore, a construction algorithm for the column augmented uniform design is provided. Some examples show that the lower bounds are tight and the construction algorithm is effective.

• [math.ST]Modelling Persistence Diagrams with Planar Point Processes, and Revealing Topology with Bagplots
Robert J Adler, Sarit Agami
http://arxiv.org/abs/1808.05655v1

We introduce a new model for planar point point processes, with the aim of capturing the structure of point interaction and spread in persistence diagrams. Persistence diagrams themselves are a key tool of TDA (topological data analysis), crucial for the delineation and estimation of global topological structure in large data sets. To a large extent, the statistical analysis of persistence diagrams has been hindered by difficulties in providing replications, a problem that was addressed in an earlier paper, which introduced a procedure called RST (replicating statistical topology). Here we significantly improve on the power of RST via the introduction of a more realistic class of models for the persistence diagrams. In addition, we introduce to TDA the idea of bagplotting, a powerful technique from non-parametric statistics well adapted for differentiating between topologically significant points, and noise, in persistence diagrams. Outside the setting of TDA, our model provides a setting for fashioning point processes, in any dimension, in which both local interactions between the points, along with global restraints on the general point cloud, are important and perhaps competing.

• [math.ST]Non-Asymptotic Behavior of the Maximum Likelihood Estimate of a Discrete Distribution
Sina Molavipour, Germán Bassi, Mikael Skoglund
http://arxiv.org/abs/1808.05771v1

In this paper, we study the maximum likelihood estimate of the probability mass function (pmf) of n independent and identically distributed (i.i.d.) random variables, in the non-asymptotic regime. We are interested in characterizing the Neyman--Pearson criterion, i.e., the log-likelihood ratio for testing a true hypothesis within a larger hypothesis. Wilks' theorem states that this ratio behaves like a \chi^2 random variable in the asymptotic case; however, less is known about the precise behavior of the ratio when the number of samples is finite. In this work, we find an explicit bound for the difference between the cumulative distribution function (cdf) of the log-likelihood ratio and the cdf of a \chi^2 random variable. Furthermore, we show that this difference vanishes with a rate of order 1/\sqrt{n} in accordance with Wilks' theorem.

• [math.ST]Weak convergences of marked empirical processes with applications to goodness-of-fit tests for Markovian processes
Koji Tsukuda, Yoichi Nishiyama
http://arxiv.org/abs/1808.05925v1

In this paper, weak convergences of marked empirical processes in L^2(\mathbb{R},\nu) and their applications to statistical goodness-of-fit tests are provided, where L^2(\mathbb{R},\nu) is a set of equivalence classes of the square integrable functions on \mathbb{R} with respect to a finite Borel measure \nu. The results obtained in our framework of weak convergences are, in the topological sense, weaker than those in the Skorokhod topology on a space of c'adl'ag functions or the uniform topology on a space of bounded functions, which have been well studied in previous works. However, our results have the following merits: (1) avoiding a smoothness condition which sometimes does not hold for some time series models appearing in statistics; (2) treating a weight function which make us possible to propose an Anderson--Darling type test statistics for goodness-of-fit tests. Indeed, the applications presented in this paper are novel.

• [physics.soc-ph]Efficient sampling of spreading processes on complex networks using a composition and rejection algorithm
Guillaume St-Onge, Jean-Gabriel Young, Laurent Hébert-Dufresne, Louis J. Dubé
http://arxiv.org/abs/1808.05859v1

Efficient stochastic simulation algorithms are of paramount importance to the study of spreading phenomena on complex networks. Using insights and analytical results from network science, we discuss how the structure of contacts affects the efficiency of current algorithms. We show that algorithms believed to require \mathcal{O}(\log N) or even \mathcal{O}(1) operations per update---where N is the number of nodes---display instead a polynomial scaling for networks that are either dense or sparse and heterogeneous. This significantly affects the required computation time for simulations on large networks. To circumvent the issue, we propose a node-based method combined with a composition and rejection algorithm, a sampling scheme that has an average-case complexity of \mathcal{O} [\log(\log N)] per update for general networks. This systematic approach is first set-up for Markovian dynamics, but can also be adapted to a number of non-Markovian processes and can enhance considerably the study of a wide range of dynamics on networks.

• [physics.soc-ph]Large-scale estimation of parking requirements for autonomous mobility on demand systems
Daniel Kondor, Paolo Santi, Kakali Basak, Xiaohu Zhang, Carlo Ratti
http://arxiv.org/abs/1808.05935v1

Cities everywhere are anticipating new mobility technologies to help solve issues with congestion and pollution while providing afforable, accessible, reliable and convenient transportation for growing populations. The adoption of self-driving vehicles is projected to happen soon and help with achieving these goals, especially if part of a shared mobility on demand service. Potential benefits of such a system include a reduction of the number of vehicles and freeing up parking spaces, while challenges still include managing the traffic volume. Previous research focused on estimating fleet size in different scenarios. In this work, we focus on estimating minimum fleet size, parking needs and total travel distance for an autonomous mobility on demand solution serving all trips made in private vehicles in Singapore, generated from a comprehensive simulation of the city's mobility. We specifically focus on parking demand as currently a significant amount of space has to be designated as parking in cities, which is poised to become obsolate if people switch from private vehicles to shared ones which are utilized much more efficiently. We show that over 85% reduction in the number of vehicles and parking spaces can be achieved while serving all trips made currently in private vehicles. We further show that potential increased traffic volume can be mitigated with the incorporation of ride-sharing, while offering even higher savings, up to 92% in both fleet size and parking needs.

• [q-bio.OT]The Function Transformation Omics - Funomics
Yongshuai Jiang, Jing Xu, Simeng Hu, Di Liu, Linna Zhao, Xu Zhou
http://arxiv.org/abs/1808.05766v1

There are no two identical leaves in the world, so how to find effective markers or features to distinguish them is an important issue. Function transformation, such as f(x,y) and f(x,y,z), can transform two, three, or multiple input/observation variables (in biology, it generally refers to the observed/measured value of biomarkers, biological characteristics, or other indicators) into a new output variable (new characteristics or indicators). This provided us a chance to re-cognize objective things or relationships beyond the original measurements. For example, Body Mass Index, which transform weight and high into a new indicator BMI=x/y^2 (where x is weight and y is high), is commonly used in to gauge obesity. Here, we proposed a new system, Funomics (Function Transformation Omics), for understanding the world in a different perspective. Funome can be understood as a set of math functions consist of basic elementary functions (such as power functions and exponential functions) and basic mathematical operations (such as addition, subtraction). By scanning the whole Funome, researchers can identify some special functions (called handsome functions) which can generate the novel important output variable (characteristics or indicators). We also start "the Funome project" to develop novel methods, function library and analysis software for Funome studies. The Funome project will accelerate the discovery of new useful indicators or characteristics, will improve the utilization efficiency of directly measured data, and will enhance our ability to understand the world. The analysis tools and data resources about the Funome project can be found gradually at http://www.funome.com.

• [q-bio.QM]Using path signatures to predict a diagnosis of Alzheimer's disease
P. J. Moore, J. Gallacher, T. J. Lyons
http://arxiv.org/abs/1808.05865v1

The path signature is a means of feature generation that can encode nonlinear interactions in the data as well as the usual linear features. It can distinguish the ordering of time-sequenced changes: for example whether or not the hippocampus shrinks fast, then slowly or the converse. It provides interpretable features and its output is a fixed length vector irrespective of the number of input points so it can encode longitudinal data of varying length and with missing data points. In this paper we demonstrate the path signature in providing features to distinguish a set of people with Alzheimer's disease from a matched set of healthy individuals. The data used are volume measurements of the whole brain, ventricles and hippocampus from the Alzheimer's Disease Neuroimaging Initiative (ADNI). The path signature method is shown to be a useful tool for the processing of sequential data which is becoming increasingly available as monitoring technologies are applied.

• [stat.ME]Data Consistency Approach to Model Validation
Andreas Svensson, Dave Zachariah, Petre Stoica, Thomas B. Schön
http://arxiv.org/abs/1808.05889v1

In scientific inference problems, the underlying statistical modeling assumptions have a crucial impact on the end results. There exist, however, only a few automatic means for validating these fundamental modelling assumptions. The contribution in this paper is a general criterion to evaluate the consistency of a set of statistical models with respect to observed data. This is achieved by automatically gauging the models' ability to generate data that is similar to the observed data. Importantly, the criterion follows from the model class itself and is therefore directly applicable to a broad range of inference problems with varying data types. The proposed data consistency criterion is illustrated and evaluated using three synthetic and two real data sets.

• [stat.ME]Estimating and accounting for unobserved covariates in high dimensional correlated data
Chris McKennan, Dan Nicolae
http://arxiv.org/abs/1808.05895v1

Many high dimensional and high-throughput biological datasets have complex sample correlation structures, which include longitudinal and multiple tissue data, as well as data with multiple treatment conditions or related individuals. These data, as well as nearly all high-throughput `omic' data, are influenced by technical and biological factors unknown to the researcher, which, if unaccounted for, can severely obfuscate estimation and inference on effects due to the known covariate of interest. We therefore developed CBCV and CorrConf: provably accurate and computationally efficient methods to choose the number of and estimate latent confounding factors present in high dimensional data with correlated or nonexchangeable residuals. We demonstrate each method's superior performance compared to other state of the art methods by analyzing simulated multi-tissue gene expression data and identifying sex-associated DNA methylation sites in a real, longitudinal twin study. As far as we are aware, these are the first methods to estimate the number of and correct for latent confounding factors in data with correlated or nonexchangeable residuals. An R-package is available for download at https://github.com/chrismckennan/CorrConf.

• [stat.ME]Statistical modeling for adaptive trait evolution in randomly evolving environment
Dwueng-Chwuan Jhwueng
http://arxiv.org/abs/1808.05878v1

In past decades, Gaussian processes has been widely applied in studying trait evolution using phylogenetic comparative analysis. In particular, two members of Gaussian processes: Brownian motion and Ornstein-Uhlenbeck process, have been frequently used to describe continuous trait evolution. Under the assumption of adaptive evolution, several models have been created around Ornstein-Uhlenbeck process where the optimum \theta^y_t of a single trait y_t is influenced with predictor x_t. Since in general the dynamics of rate of evolution \tau^y_t of trait could adopt a pertinent process, in this work we extend models of adaptive evolution by considering the rate of evolution \tau_t^y following the Cox-Ingersoll-Ross (CIR) process. We provide a heuristic Monte Carlo simulation scheme to simulate trait along the phylogeny as a structure of dependence among species. We add a framework to incorporate multiple regression with interaction between optimum of the trait and its potential predictors. Since the likelihood function for our models are intractable, we propose the use of Approximate Bayesian Computation (ABC) for parameter estimation and inference. Simulation as well as empirical study using the proposed models are also performed and carried out to validate our models and for practical applications.

• [stat.ML]A bagging and importance sampling approach to Support Vector Machines
R. Bárcenas, M. D. Gónzalez--Lima, A. J. Quiroz
http://arxiv.org/abs/1808.05917v1

An importance sampling and bagging approach to solving the support vector machine (SVM) problem in the context of large databases is presented and evaluated. Our algorithm builds on the nearest neighbors ideas presented in Camelo at al. (2015). As in that reference, the goal of the present proposal is to achieve a faster solution of the SVM problem without a significance loss in the prediction error. The performance of the methodology is evaluated in benchmark examples and theoretical aspects of subsample methods are discussed.

• [stat.ML]An N Time-Slice Dynamic Chain Event Graph
Rodrigo A. Collazo, Jim Q. Smith
http://arxiv.org/abs/1808.05726v1

The Dynamic Chain Event Graph (DCEG) is able to depict many classes of discrete random processes exhibiting asymmetries in their developments and context-specific conditional probabilities structures. However, paradoxically, this very generality has so far frustrated its wide application. So in this paper we develop an object-oriented method to fully analyse a particularly useful and feasibly implementable new subclass of these graphical models called the N Time-Slice DCEG (NT-DCEG). After demonstrating a close relationship between an NT-DCEG and a specific class of Markov processes, we discuss how graphical modellers can exploit this connection to gain a deep understanding of their processes. We also show how to read from the topology of this graph context-specific independence statements that can then be checked by domain experts. Our methods are illustrated throughout using examples of dynamic multivariate processes describing inmate radicalisation in a prison.

• [stat.ML]Correlated Multi-armed Bandits with a Latent Random Source
Samarth Gupta, Gauri Joshi, Osman Yağan
http://arxiv.org/abs/1808.05904v1

We consider a novel multi-armed bandit framework where the rewards obtained by pulling the arms are functions of a common latent random variable. The correlation between arms due to the common random source can be used to design a generalized upper-confidence-bound (UCB) algorithm that identifies certain arms as non-competitive, and avoids exploring them. As a result, we reduce a K-armed bandit problem to a C+1-armed problem, where C+1 includes the best arm and C competitive arms. Our regret analysis shows that the competitive arms need to be pulled \mathcal{O}(\log T) times, while the non-competitive arms are pulled only \mathcal{O}(1) times. As a result, there are regimes where our algorithm achieves a \mathcal{O}(1) regret as opposed to the typical logarithmic regret scaling of multi-armed bandit algorithms. We also evaluate lower bounds on the expected regret and prove that our correlated-UCB algorithm is order-wise optimal.

• [stat.ML]Learning Supervised Topic Models for Classification and Regression from Crowds
Filipe Rodrigues, Mariana Lourenço, Bernardete Ribeiro, Francisco Pereira
http://arxiv.org/abs/1808.05902v1

The growing need to analyze large collections of documents has led to great developments in topic modeling. Since documents are frequently associated with other related variables, such as labels or ratings, much interest has been placed on supervised topic models. However, the nature of most annotation tasks, prone to ambiguity and noise, often with high volumes of documents, deem learning under a single-annotator assumption unrealistic or unpractical for most real-world applications. In this article, we propose two supervised topic models, one for classification and another for regression problems, which account for the heterogeneity and biases among different annotators that are encountered in practice when learning from crowds. We develop an efficient stochastic variational inference algorithm that is able to scale to very large datasets, and we empirically demonstrate the advantages of the proposed model over state-of-the-art approaches.

• [stat.ML]Multiview Boosting by Controlling the Diversity and the Accuracy of View-specific Voters
Anil Goyal, Emilie Morvant, Pascal Germain, Massih-Reza Amini
http://arxiv.org/abs/1808.05784v1

In this paper we propose a boosting based multiview learning algorithm, referred to as PB-MVBoost, which iteratively learns i) weights over view-specific voters capturing view-specific information; and ii) weights over views by optimizing a PAC-Bayes multiview C-Bound that takes into account the accuracy of view-specific classifiers and the diversity between the views. We derive a generalization bound for this strategy following the PAC-Bayes theory which is a suitable tool to deal with models expressed as weighted combination over a set of voters. Different experiments on three publicly available datasets show the efficiency of the proposed approach with respect to state-of-art models.

• [stat.ML]Randomized Least Squares Regression: Combining Model- and Algorithm-Induced Uncertainties
Jocelyn T. Chi, Ilse C. F. Ipsen
http://arxiv.org/abs/1808.05924v1

We analyze the uncertainties in the minimum norm solution of full-rank regression problems, arising from Gaussian linear models, computed by randomized (row-wise sampling and, more generally, sketching) algorithms. From a deterministic perspective, our structural perturbation bounds imply that least squares problems are less sensitive to multiplicative perturbations than to additive perturbations. From a probabilistic perspective, our expressions for the total expectation and variance with regard to both model- and algorithm-induced uncertainties, are exact, hold for general sketching matrices, and make no assumptions on the rank of the sketched matrix. The relative differences between the total bias and variance on the one hand, and the model bias and variance on the other hand, are governed by two factors: (i) the expected rank deficiency of the sketched matrix, and (ii) the expected difference between projectors associated with the original and the sketched problems. A simple example, based on uniform sampling with replacement, illustrates the statistical quantities.

相关文章

网友评论

    本文标题:今日学术视野(2018.8.21)

    本文链接:https://www.haomeiwen.com/subject/zxzriftx.html