CVPR2018-papers

Transductive Unbiased Embedding for Zero-Shot Learning
Frustum PointNets for 3D Object Detection from RGB-D Data
Enhancing the Spatial Resolution of Stereo Images using a Parallax Prior
DiverseNet: When One Right Answer Is Not Enough
SSNet: Scale Selection Network for Online 3D Action Prediction
Very Large-Scale Global SfM by Distributed Motion Averaging
PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing
Dynamic Feature Learning for Partial Face Recognition
Context-aware Deep Feature Compression for High-speed Visual Tracking
Between-class Learning for Image Classification
DVQA: Understanding Data Visualizations via Question Answering
Human Appearance Transfer
Learning to Segment Every Thing
Globally Optimal Inlier Set Maximization for Atlanta Frame Estimation
Re-weighted Adversarial Adaptation Network for Unsupervised Domain Adaptation
Learning to Compare: Relation Network for Few-Shot Learning
Arbitrary Style Transfer with Deep Feature Reshuffle
Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks
Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework
Guided Proofreading of Automatic Segmentations for Connectomics
Deep PhaseNet for Video Frame Interpolation
Context-aware Synthesis for Video Frame Interpolation
Lean Multiclass Crowdsourcing
Unsupervised Deep Generative Adversarial Hashing Network
R-FCN-3000 at 30fps: Decoupling Detection and Classification
Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
Gated Fusion Network for Single Image Dehazing
Learning a Complete Image Indexing Pipeline
Mask-guided Contrastive Attention Model for Person Re-Identification
Learning Pose Specific Representations by Predicting Different Views
Deep Mutual Learning
Improving Occlusion and Hard Negative Handling for Single-Stage Object Detectors
Defense against adversarial attacks using guided denoiser
Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking
Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships
Decorrelated Batch Normalization
On the Duality Between Retinex and Image Dehazing
CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes
The Perception-Distortion Tradeoff
Image Blind Denoising With Generative Adversarial Network Based Noise Modeling
Distort-and-Recover: Color Enhancement using Deep Reinforcement Learning
A Low Power, High Throughput, Fully Event-Based Stereo System
Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present
End-to-end Flow Correlation Tracking with Spatial-temporal Attention
Exploiting Transitivity for Learning Person Re-identification Models on a Budget
Imagination-IQA: No-reference Image Quality Assessment via Adversarial Learning
Egocentric Activity Recognition on a Budget
Person Transfer GAN to Bridge Domain Gap for Person Re-Identification
Duplex Generative Adversarial Network for Unsupervised Domain Adaptation
Fine-grained Video Captioning for Sports Narrative
High Performance Visual Tracking with Siamese Region Proposal Network
Adversarially Occluded Samples for Person Re-identification
MatNet: Modular Attention Network for Referring Expression Comprehension
Low-Latency Video Semantic Segmentation
MapNet: An Allocentric Spatial Memory for Mapping Environments
Fast End-to-End Trainable Guided Filter
Partial Transfer Learning with Selective Adversarial Networks
Reconstruction Network for Video Captioning
Improving Landmark Localization with Semi-Supervised Learning
Unsupervised Person Image Synthesis in Arbitrary Poses
Efficient Large-scale Approximate Nearest Neighbor Search on OpenCL FPGA
Deep End-to-End Time-of-Flight Imaging
Augmenting Crowd-Sourced 3D Reconstructions using Semantic Detections
DocUNet: Document Image Unwarping via A Stacked U-Net
Geometry Aware Optimization for Deep Learning: The Good Practice
Learning to Detect Features in Texture Images
LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation
Spatially-Adaptive Filter Units for Deep Neural Networks
Revisiting Video Saliency: A Large-scale Benchmark and a New Model
Real-World Repetition Estimation by Div, Grad and Curl
Learning Visual Knowledge Memory Networks for Visual Question Answering
Attention-aware Compositional Network for Person Re-Identification
Sim2Real View Invariant Visual Servoing by Recurrent Control
Time-resolved Light Transport Decomposition for Thermal Photometric Stereo
Trapping Light for Time of Flight
A Unifying Contrast Maximization Framework for Event Cameras, with Applications to Motion, Depth, and Optical Flow Estimation
Global versus Localized Generative Adversarial Nets
Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions
Learning a Toolchain for Image Restoration
CNN based Learning using Reflection and Retinex Models for Intrinsic Image Decomposition
Feature Quantization for Defending Against Distortion of Images
A Minimalist Approach to Type-Agnostic Detection of Quadrics in Point Clouds
Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation
Aperture Supervision for Monocular Depth Estimation
Divide and Conquer for Full-Resolution Light Field Deblurring
Multi-shot Pedestrian Re-identification via Sequential Decision Making
Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features
Depth-Aware Stereo Video Retargeting
Multistage Adversarial Losses for Pose-Based Human Image Synthesis
Multi-Content GAN for Few-Shot Font Style Transfer
Multi-Cue Correlation Filters for Robust Visual Tracking
A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects
Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
Improving Color Reproduction Accuracy in the Camera Imaging Pipeline
Net2Vec: Quantifying and Explaining how Concepts are Encoded by Filters in Deep Neural Networks
Sketch-a-Classifier: Sketch-based Photo Classifier Generation
Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks
TOM-Net: Learning Transparent Object Matting from a Single Image
Estimation of Camera Locations in Highly Corrupted Scenarios: All About the Base, No Shape Trouble
Direction-aware Spatial Context Features for Shadow Detection
Neural Motifs: Scene Graph Parsing with Global Context
Object Referring in Videos with Language and Human Gaze
Learning Transferable Architectures for Scalable Image Recognition
View Extrapolation of Human Body from a Single Image
Probabilistic Plant Modeling via Multi-View Image-to-Image Translation
Learning a Discriminative Prior for Blind Image Deblurring
Optimal Structured Light a la Carte
Revisiting Deep Intrinsic Image Decompositions
GAGAN: Geometry Aware Generative Adversarial Networks
Learning Multi-grid Generative ConvNets by Minimal Contrastive Divergence
Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization
Diversity Regularized Spatiotemporal Attention for Video-based Person Re-identification
Variational Autoencoders for Deforming 3D Mesh Models
Rotation Averaging and Strong Duality
3D Hand Pose Estimation: From Current Achievements to Future Goals
Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions
A Robust Generative Framework for Generalized Zero-Shot Learning
Two can play this Game: Visual Dialog with Discriminative Visual Question Generation and Visual Question Answering
Rotation-sensitive Regression for Oriented Scene Text Detection
Adversarial Feature Augmentation for Unsupervised Domain Adaptation
Deep Regression Forests for Age Estimation
FOTS: Fast Oriented Text Spotting with a Unified Network
SoS-RSC: A Sum-of-Squares Polynomial Approach to Robustifying Subspace Clustering Algorithms
Efficient Subpixel Refinement with Symbolic Linear Predictors
Self-Supervised Feature Learning by Learning to Spot Artifacts
PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation
Scale-recurrent Network for Deep Image Deblurring
Multi-Cell Classification by Convolutional Dictionary Learning with Class Proportion Priors
Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks
On the convergence of PatchMatch and its variants
Clinical Skin Lesion Diagnosis using Representations Inspired by Dermatologist Criteria
PoTion: Pose MoTion Representation for Action Recognition
Zigzag Learning for Weakly Supervised Object Detection
VITAL: VIsual Tracking via Adversarial Learning
Crowd Counting with Deep Negative Correlation Learning
Multi-Label Zero-Shot Learning with Structured Knowledge Graphs
Learning a Discriminative Filter Bank within a CNN for Fine-grained Recognition
A Closer Look at Spatiotemporal Convolutions for Action Recognition
Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification
End-to-End Deep Kronecker-Product Matching for Person Re-identification
Consensus Maximization for Semantic Region Correspondences
SBNet: Sparse Blocks Network for Fast Inference
Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints
Group Consistent Similarity Learning via Deep CRFs for Person Re-Identification
Now You Shake Me: Towards Automatic 4D Cinema
Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network
Interpret Neural Networks by Identifying Critical Data Routing Paths
Deep Reinforcement Learning of Region Proposal Networks for Object Detection
Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Video
Semantic Visual Localization
DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks
Composing Two Objects of Interest for Flying Camera Photography
Kernelized Subspace Pooling for Deep Local Descriptors
Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks
Deep Lesion Graph in the Wild: Relationship Learning and Organization of Significant Radiology Image Findings in a Diverse Large-scale Lesion Database
An Efficient and Provable Approach for Mixture Proportion Estimation Using Linear Independence Assumption
Eliminating Background-bias for Robust Person Re-identification
Geometry-Aware Network for Non-Rigid Shape Prediction from a Single View
High-order tensor regularization with application to attribute ranking
Taskonomy: Disentangling Task Transfer Learning
BlockDrop: Dynamic Inference Paths in Residual Networks
Attend and Interact: Higher-Order Object Interactions for Video Understanding
Bilateral Ordinal Relevance Multi-instance Regression for Facial Action Unit Intensity Estimation
CarFusion: Combining Point Tracking and Part Detection for Dynamic 3D Reconstruction of Vehicles
Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-Identification
Large Scale Fine-Grained Categorization and the Effectiveness of Domain-Specific Transfer Learning
BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
Improved Human Pose Estimation through Adversarial Data Augmentation
ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
SINT++: Robust Visual Tracking via Adversarial Hard Positive Generation
Structured Uncertainty Prediction Networks
Geometry-Guided CNN for Self-supervised Video Representation Learning
Low-Shot Recognition with Imprinted Weights
Self-supervised Learning of Geometrically Stable Features Through Probabilistic Introspection
Disentangling Structure and Aesthetics for Content-aware Image Completion
A Volumetric Descriptive Network for 3D Object Synthesis
Interpretable Convolutional Neural Networks
Single Image Dehazing via Conditional Generative Adversarial Network
Neural Inverse Kinematics for Unsupervised Motion Retargetting
Environment Upgrade Reinforcement Learning for Non-differentiable Multi-stage Pipelines
Teaching Categories to Human Learners with Visual Explanations
Facelet-Bank for Fast Portrait Manipulation
Convolutional Sequence to Sequence Model for Human Dynamics
Human Semantic Parsing for Person Re-identification
Latent RANSAC
LiDAR-Video Driving Dataset: Learning Driving Policies Effectively
Actor and Observer: Joint Modeling of First and Third-Person Videos
Controllable Video Generation with Sparse Trajectories
What have we learned from deep representations for action recognition?
Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning
Language-Based Image Editing with Recurrent Attentive Models
Graph-Cut RANSAC
Optimizing Filter Size in Convolutional Neural Networks for Facial Action Unit Recognition
Memory Based Online Learning of Deep Representations from Video Streams
Deep Layer Aggregation
Learning Convolutional Networks for Content-weighted Image Compression
Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250Hz
Efficient, sparse representation of manifold distance matrices for classical scaling
Visual to Sound: Generating Natural Sound for Videos in the Wild
A Prior-Less Method for Multi-Face Tracking in Unconstrained Videos
Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks
Self-calibrating polarising radiometric calibration
Pix3D: Dataset and Methods for 3D Object Modeling from a Single Image
Learning to Promote Saliency Detectors
Pose Transferrable Person Re-Identification
Hashing as Tie-Aware Learning to Rank
Baseline Desensitizing In Translation Averaging
Conditional Image-to-Image Translation
Blind Predicting Similar Quality Map for Image Quality Assessment
Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
CNN Driven Sparse Multi-Level B-spline Image Registration
Through-Wall Human Pose Estimation Using Radio Signals
xUnit: Learning a Spatial Activation Function for Efficient Image Restoration
CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization
FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds
Weakly Supervised Coupled Networks for Visual Sentiment Analysis
Ring loss: Convex Feature Normalization for Face Recognition
Fast Spectral Ranking for Similarity Search
PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning
AMNet: Memorability Estimation with Attention
Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for Fine-grained Classification
End-to-End Learning of Motion Representation for Video Understanding
Smooth Neighbors on Teacher Graphs for Semi-supervised Learning
SeedNet: Automatic Seed Generation with Deep Reinforcement Learning for Robust Interactive Segmentation
Deep Spatio-Temporal Random Fields for Efficient Video Segmentation
Perturbative Neural Networks: Rethinking Convolution in CNNs
SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks
Neural 3D Mesh Renderer
Deep Parametric Continuous Convolutional Neural Networks
Visual Question Reasoning on General Dependency Tree
Non-local Neural Networks
Light field intrinsics with a deep encoder-decoder network
Feature Space Transfer for Data Augmentation
Motion Segmentation by Exploiting Complementary Geometric Models
Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation
Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation
Towards a Mathematical Understanding of the Difficulty in Learning with Feedforward Neural Networks
Few-Shot Image Recognition by Predicting Parameters from Activations
Deep Video Super-Resolution Network Using Dynamic Upsampling Filters Without Explicit Motion Compensation
CLEAR: Cumulative LEARning for One-Shot One-Class Image Recognition
Pose-Robust Face Recognition via Deep Residual Equivariant Mapping
Deep Cross-media Knowledge Transfer
Large-scale Point Cloud Semantic Segmentation with Superpoint Graphs
A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos
Recurrent Slice Networks for 3D Segmentation on Point Clouds
Dimensionality's Blessing: Detecting the distributions underlying images
Augmented Skeleton Space Transfer for Depth-based Hand Pose Estimation
Robust Classification with Convolutional Prototype Learning
DecideNet: Counting Varying Density Crowds Through Attention Guided Detection and Density Estimation
ICE-BA: Incremental, Consistent and Efficient Bundle Adjustment for Visual-Inertial SLAM
Grounding Referring Expressions in Images by Variational Context
Pseudo-Mask Augmented Object Detection
Improvements to context based self-supervised learning
Left-Right Comparative Recurrent Model for Stereo Matching
Learning deep structured active contours end-to-end
Efficient and Deep Person Re-Identification using Multi-Level Similarity
Learning Intrinsic Image Decomposition from Watching the World
Learning to Understand Image Blur
Gaze Prediction in Dynamic 360° Immersive Videos
Emotional Attention: A Study of Image Sentiment and Visual Attention
Single View Stereo Matching
Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs
Video Representation Learning Using Discriminative Pooling
Probabilistic Joint Face-Skull Modelling for Facial Reconstruction
Indoor RGB-D Compass from a Single Line and Plane
pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment
Generative Adversarial Learning Towards Fast Weakly Supervised Detection
Seeing Temporal Modulation of Lights from Standard Cameras
Shape from Shading through Shape Evolution
Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries
Neural Style Transfer via Meta Networks
UV-GAN: Adversarial Facial UV Map Completion for Pose-invariant Face Recognition
Cascaded Pyramid Network for Multi-Person Pose Estimation
Detect-and-Track: Efficient Pose Estimation in Videos
SobolevFusion: 3D Reconstruction of Scenes Undergoing Free Non-rigid Motion
NAG: Network for Adversary Generation
Inferring Co-Attention in Social Scene Videos
Unsupervised Learning of Single View Depth Estimation and Visual Odometry with Deep Feature Reconstruction
Egocentric Basketball Motion Planning from a Single First-Person Image
Geometric robustness of deep networks: analysis and improvement
Pose-Guided Photorealistic Face Rotation
Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation
Importance Weighted Adversarial Nets for Partial Domain Adaptation
Towards High Performance Video Object Detection
SurfConv: Bridging 3D and 2D Convolution for RGBD Images
People, Penguins and Petri Dishes: Adapting Object Counting Models To New Visual Domains And Object Types Without Forgetting
Fully Convolutional Adaptation Networks for Semantic Segmentation
Towards Pose Invariant Face Recognition in the Wild
Interactive Image Segmentation with Latent Diversity
Label Denoising Adversarial Network (LDAN) for Inverse Lighting of Face Images
Detecting and Recognizing Human-Object Interactions
Deep Image Prior
2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning
Direct Shape Regression Networks for End-to-End Face Alignment
Disentangling Features in 3D Face Shapes for Joint Face Reconstruction and Recognition
Scale-Transferrable Object Detection
Learning by Asking Questions
3D Pose Estimation and 3D Model Retrieval for Objects in the Wild
Deep Progressive Reinforcement Learning for Skeleton-based Action Recognition
Future Person Localization in First-Person Videos
3D-RCNN: Instance-level 3D Scene Understanding via Render-and-Compare
Manifold Learning in Quotient Spaces
Image Correction via Deep Reciprocating HDR Transformation
Focus Manipulation Detection via Photometric Histogram Analysis
Density Adaptive Point Set Registration
Multi-view Harmonized Bilinear Network for 3D Object Recognition
SeGAN: Segmenting and Generating the Invisible
VizWiz Grand Challenge: Answering Visual Questions from Blind People
Sparse, Smart Contours to Represent and Edit Images
Generative Non-Rigid Shape Completion with Graph Convolutional Autoencoders
The power of ensembles for active learning in image classification
OLÉ: Orthogonal Low-rank Embedding, A Plug and Play Geometric Loss for Deep Learning
Learning Compositional Visual Concepts with Mutual Consistency
Adversarial Complementary Learning for Weakly Supervised Object Localization
Analytical Modeling of Vanishing Points and Curves in Catadioptric Cameras
Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification by Stepwise Learning
Learning to Sketch with Shortcut Cycle Consistency
Domain Adaptive Faster R-CNN for Object Detection in the Wild
Attentive Generative Adversarial Network for Raindrop Removal from A Single Image
Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN
Making Convolutional Networks Recurrent for Visual Sequence Learning
Multi-Task Adversarial Network for Disentangled Feature Learning
Fight ill-posedness with ill-posedness: Single-shot variational depth super-resolution from shading
Zero-Shot Sketch-Image Hashing
Learning to Localize Sound Source in Visual Scenes
Cross-Domain Weakly-Supervised Object Detection through Progressive Domain Adaptation
Semi-parametric Image Synthesis
Multi-scale Location-aware Kernel Representation for Object Detection
W2F: A Weakly-Supervised to Fully-Supervised Framework for Object Detection
Generative Modeling using the Sliced Wasserstein Distance
MX-LSTM: mixing tracklets and vislets to jointly forecast trajectories and head poses
Dynamic Video Segmentation Network
Learning a Discriminative Feature Network for Semantic Segmentation
Video Person Re-identification with Competitive Snippet-similarity Aggregation and Co-attentive Snippet Embedding
Curve Reconstruction via the Global Statistics of Natural Curves
Single-Shot Refinement Neural Network for Object Detection
Density-aware Single Image De-raining using a Multi-stream Dense Network
Learning Answer Embeddings for Visual Question Answering
Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification
Translating and Segmenting Multimodal Medical Volumes with Cycle- and Shape-Consistency Generative Adversarial Network
Learning from the Deep: A Revised Underwater Image Formation Model
Mean-Variance Loss for Deep Age Estimation from a Face
Disentangled Person Image Generation
Deep Sparse Coding for Invariant Multimodal Halle Berry Neurons
DeepMVS: Learning Multi-View Stereopsis
Embodied Question Answering
Deflecting Adversarial Attacks with Pixel Deflection
Dynamic-Structured Semantic Propagation Network
Integrated facial landmark localization and super-resolution of real-world very low resolution faces in arbitrary poses with GANs
A Two-Step Disentanglement Method
Towards Effective Low-bitwidth Convolutional Neural Networks
Natural and Effective Obfuscation by Head Inpainting
"Learning-Compression" algorithms for neural net pruning
Salient Object Detection Driven by Fixation Prediction
Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective
Uncalibrated Photometric Stereo under Natural Illumination
Learning Monocular 3D Human Pose Estimation on Weakly-Supervised Multi-view Images
An Unsupervised Learning Model for Deformable Medical Image Registration
Learning Deep Correspondence through Prior and Posterior Feature Constancy
Anticipating Traffic Accidents with Adaptive Loss and Large-scale Incident DB
A2-RL: Aesthetics Aware Reinforcement Learning for Image Cropping
Learned Shape-Tailored Descriptors for Segmentation
One-shot Action Localization by Sequence Matching Network
Robust Physical-World Attacks on Deep Learning Visual Classification
What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets
Bidirectional Retrieval Made Simple
Reward Learning by Instruction
MegaDepth: Learning Single-View Depth Prediction from Internet Photos
Cross-Dataset Adaptation for Visual Question Answering
Interpretable Video Captioning via Trajectory Structured Localization
MoCoGAN: Decomposing Motion and Content for Video Generation
Left/Right Asymmetric Layer Skippable Networks
Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation
Unsupervised Discovery of Object Landmarks as Structural Representations
Learning Deep Descriptors with Scale-Aware Triplet Networks
Robust Depth Estimation from Auto Bracketed Images
Aligning Infinite-Dimensional Covariance Matrices in Reproducing Kernel Hilbert Spaces for Domain Adaptation
Local and Global Optimization Techniques in Graph-based Clustering
Learning from Millions of 3D Scans for Large-scale 3D Face Recognition
CBMV: A Coalesced Bidirectional Matching Volume for Disparity Estimation
Image Collection Pop-up: 3D Reconstruction and Clustering of Rigid and Non-Rigid Categories
Ordinal Depth Supervision for 3D Human Pose Estimation
Learning to Hash by Discrepancy Minimization
MapNet: Geometry-Aware Learning of Maps for Camera Localization
Im2Struct: Recovering 3D Shape Structure from a Single RGB Image
A Pose-Sensitive Embedding for Person Re-Identification with Expanded Cross Neighborhood Re-Ranking
Analytic Expressions for Probabilistic Moments of PL-DNN with Gaussian Input
Cross-Domain Self-supervised Multi-task Feature Learning Using Synthetic Game Imagery
Coding Kendall's Shape Trajectories for 3D Action Recognition
Camera Pose Estimation with Unknown Principal Point
Learning Spatial-Aware Regressions for Visual Tracking
The Easy, The Medium and The Hard: Adapting Across Varied Domain Shifts
Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation
A Hybrid L1-L0 Layer Decomposition Model for Tone Mapping
LIME: Live Intrinsic Material Estimation
Learning Representations for Single Cells in Microscopy Images
Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning
clcNet: Improving the Efficiency of Convolutional Neural Network using Channel Local Convolutions
Spanning Patches: Deep Patch Selection for Fast Multi-View Stereo
LAMV: Learning to align and match videos with kernelized temporal layers
Single Image Reflection Separation with Perceptual Losses
Structure from Recurrent Motion: From Rigidity to Recurrency
Customized Image Narrative Generation via Interactive Visual Question Generation and Answering
Relation Networks for Object Detection
An End-to-End TextSpotter with Explicit Alignment and Attention
Photometric Stereo in Participating Media Considering Shape-Dependent Forward Scatter
Sliced Wasserstein Distance for Learning Gaussian Mixture Models
Generative Adversarial Image Synthesis with Decision Tree Latent Controller
Disentangling 3D Pose in A Dendritic CNN for Unconstrained 2D Face Alignment
Learning Multi-Instance Enriched Image Representation via Non-Greedy Simultaneous L1-Norm Minimization and Maximization
Separating Self-Expression and Visual Content in Hashtag Supervision
Residual Dense Network for Image Super-Resolution
Hand PointNet: 3D Hand Pose Estimation using Point Sets
Human-centric Indoor Scene Synthesis Using Stochastic Grammar
Learning Facial Action Units from Web Images with Scalable Weakly Supervised Clustering
Occlusion Aware Unsupervised Learning of Optical Flow
Domain Generalization with Adversarial Feature Learning
A Hierarchical Generative Model for Eye Image Synthesis and Eye Gaze Estimation
PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image
Deep Learning under Privileged Information Using Heteroscedastic Dropout
Frame-Recurrent Video Super-Resolution
Nonlocal Low-Rank Tensor Factor Analysis for Image Restoration
Content-Sensitive Supervoxels via Uniform Tessellations on Video Manifolds
Planar Shape Detection at Structural Scales
Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking
Learning to Parse Wireframes in Images of Man-Made Environments
Harmonious Attention Network for Person Re-Identification
Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks
Every Smile is Unique: Landmark-guided Diverse Smile Generation
Multi-Scale Weighted Nuclear Norm Image Restoration
FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis
Lightweight Probabilistic Deep Networks
Learning Depth from Monocular Videos using Direct Methods
Thoracic Disease Identification and Localization with Limited Supervision
SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation
Memory Matching Networks for One-Shot Image Recognition
Compressed Video Action Recognition
FFNet: Video Fast-Forwarding via Reinforcement Learning
Representing and Learning High Dimensional Data with the Optimal Transport Map from a Probabilistic Viewpoint
ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans
Fully Convolutional Attention Network for Multimodal Reasoning
Lions and Tigers and Bears: Capturing Non-Rigid, 3D, Articulated Shape from Images
Recurrent Pixel Embedding for Instance Grouping
Name-removed-for-review: A Multi-camera HD Dataset for Dense Unscripted Pedestrian Detection
SGAN: An Alternative Training of Generative Adversarial Networks
Learning Markov Clustering Networks for Scene Text Detection
Occlusion-Aware Rolling Shutter Rectification of 3D Scenes
Beyond Gröbner Bases: Basis Selection for Minimal Solvers
Improving Object Localization with Fitness NMS and Bounded IoU Loss
Generative Adversarial Perturbations
Deep Photo Enhancer: Unsupervised Learning of Image Enhancement from Photographs with GANs
Eye In-Painting with Exemplar Generative Adversarial Networks
Encoder-Decoder Alignment for Zero-Pair Image-to-Image Translation
Learning Structure and Strength of CNN Filters for Small Sample Size Training
Path Aggregation Network for Instance Segmentation
Learning Superpixels with Segmentation-Aware Affinity Loss
Data Distillation: Towards Omni-Supervised Learning
Deep Diffeomorphic Transformer Networks
CodeSLAM --- Learning a Compact, Optimisable Representation for Dense Visual SLAM
Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points
Learning Latent Super-Events to Detect Multiple Activities in Videos
MegDet: A Large Mini-Batch Object Detector
Lose The Views: Limited Angle CT Reconstruction via Implicit Sinogram Completion
Unsupervised Domain Adaptation with Similarity-Based Classifier
Visual Feature Attribution using Wasserstein GANs
Tell Me Where To Look: Guided Attention Inference Network
Towards Open-Set Identity Preserving Face Synthesis
Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination
Multi-Evidence Fusion and Filtering for Weakly Supervised Object Recognition, Detection and Segmentation
Deep Material-aware Cross-spectral Stereo Matching
MakeupGAN: Makeup Transfer via Cycle-Consistent Adversarial Networks
M3: Multimodal Memory Modelling for Video Captioning
Fooling Vision and Language Models Despite Localization and Attention Mechanism
Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies
Jointly Localizing and Describing Events for Dense Video Captioning
The Best of Both Worlds: Combining CNNs and Geometric Constraints for Hierarchical Motion Segmentation
End-to-end learning of keypoint detector and descriptor for pose invariant 3D matching
LDMNet: Low Dimensional Manifold Regularized Neural Networks
3D Human Pose Estimation in the Wild by Adversarial Learning
Fast Video Object Segmentation by Reference-Guided Mask Propagation
End-to-End Dense Video Captioning with Masked Transformer
Towards dense object tracking in a 2D honeybee hive
Appearance-and-Relation Networks for Video Classification
StarGAN: Unified Generative Adversarial Networks for Controllable Multi-Domain Image-to-Image Translation
Answer with Grounding Snippets: Focal Visual-Text Attention for Visual Question Answering
GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB
Weakly Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer
ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information
Structured Set Matching Networks for One-Shot Part Labeling
Real-Time Seamless Single Shot 6D Object Pose Prediction
Triplet-Center Loss for Multi-View 3D Object Retrieval
Pixels, voxels, and views: A study of shape representations for single view 3D object shape prediction
Show Me a Story: Towards Coherent Neural Story Illustration
DeLS-3D: Deep Localization and Segmentation with a 3D Semantic Map
Missing Slice Recovery for Tensors Using a Low-rank Model in Embedded Space
3D Semantic Segmentation with Submanifold Sparse Convolutional Networks
Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition
Link and code: Fast indexing with graphs and compact regression codes
Two-Stream Convolutional Networks for Dynamic Texture Synthesis
Weakly Supervised Action Localization by Sparse Temporal Pooling Network
Viewpoint-aware Video Summarization
4D Human Body Correspondences from Panoramic Depth Maps
Tighter Lifting-Free Convex Relaxations for Quadratic Matching Problems
Discovering Point Lights with Intensity Distance Fields
The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks
Geometry-aware Deep Network for Single-Image Novel View Synthesis
Temporal Deformable Residual Networks for Action Segmentation in Videos
Seeing Small Faces from Robust Anchor's Perspective
Matryoshka Networks: Predicting 3D Geometry via Nested Shape Layers
On the Importance of Label Quality for Semantic Segmentation
AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations
Learning Deep Sketch Abstraction
Non-Linear Temporal Subspace Representations for Activity Recognition
A Biresolution Spectral framework for Product Quantization
Unsupervised Cross-dataset Person Re-identification by Transfer Learning of Spatio-temporal Patterns
Feature Super-Resolution: Make Machine See More Clearly
Finding Tiny Faces in the Wild with Generative Adversarial Network
DoubleFusion: Real-time Capture of Human Performance with Inner Body Shape from a Single Depth Sensor
Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective
Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction
Recognize Actions by Disentangling Components of Dynamics
Who Let The Dogs Out? Modeling Dog Behavior From Visual Data
Alive Caricature from 2D to 3D
Learning Steerable Filters for Rotation Equivariant CNNs
From source to target and back: Symmetric Bi-Directional Adaptive GAN
Monocular Relative Depth Perception with Web Stereo Data Supervision
Correlation Tracking via Joint Discrimination and Reliability Learning
Boosting Domain Adaptation by Discovering Latent Domains
HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization
Learning from Noisy Web Data with Category-level Supervision
Embodied Real-World Active Perception
Boosting Self-Supervised Learning via Knowledge Transfer
Video Captioning via Hierarchical Reinforcement Learning
Weakly Supervised Phrase Localization with Multi-Scale Anchored Transformer Network
Progressively Complementarity-aware Fusion Network for RGB-D Salient Object Detection
Wide Compression: Tensor Ring Nets
Demo2Vec: Reasoning Object Affordances from Online Videos
A High-Quality Denoising Dataset for Smartphone Cameras
Collaborative and Adversarial Network for Unsupervised domain adaptation
End-to-end weakly-supervised semantic alignment
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
Feature Selective Networks for Object Detection
Unsupervised Learning of Depth and Egomotion from Monocular Video Using 3D Geometric Constraints
A Common Framework for Interactive Texture Transfer
Depth and Transient Imaging with Compressive SPAD Array Cameras
PointGrid: A Deep Network for 3D Shape Understanding
A Network Architecture for Point Cloud Classification via Automatic Depth Images Generation
Optimizing Local Feature Descriptors for Nearest Neighbor Matching
4DFAB: A Large Scale 4D Database for Facial Expression Analysis and Biometric Applications
Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains
Photographic Text-to-Image Synthesis with a Hierarchically-nested Adversarial Network
Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal
What do Deep Networks Like to See?
On the Robustness of Semantic Segmentation Models to Adversarial Attacks
SketchMate: Deep Hashing for Million-Scale Human Sketch Retrieval
Progressive Attention Guided Recurrent Network for Salient Object Detection
IQA: Visual Question Answering in Interactive Environments
Boosting Adversarial Attacks with Momentum
Conditional Probability Models for Deep Image Compression
Cascade R-CNN: Delving into High Quality Object Detection
Scalable and Effective Deep CCA via Soft Decorrelation
Discriminability objective for training descriptive captions
Going from Image to Video Saliency: Augmenting Image Salience with Dynamic Attentional Push
Recurrent Scene Parsing with Perspective Understanding in the Loop
Semantic Video Segmentation by Gated Recurrent Flow Propagation
FlipDial: A Generative Model for Two-Way Visual Dialogue
Context Encoding for Semantic Segmentation
Deep Marching Cubes: Learning Explicit Surface Representations
Rethinking Feature Distribution for Loss Functions in Image Classification
Optical Flow Guided Feature: A Motion Representation for Video Action Recognition
Multimodal Explanations: Justifying Decisions and Pointing to the Evidence
HATS: Histograms of Averaged Time Surfaces for Robust Event-based Object Classification
Imagine it for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts
Co-Occurrence Template Matching
Defense against Universal Adversarial Perturbations
PPFNet: Global Context Aware Local Features for Robust 3D Point Matching
Dynamic Zoom-in Network for Fast Object Detection in Large Images
Objects as context for detecting their semantic parts
Spline Error Weighting for Robust Visual-Inertial Fusion
GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation
Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks
Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network
Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net
CondenseNet: An Efficient DenseNet using Learned Group Convolutions
Burst Denoising with Kernel Prediction Networks
Leveraging Unlabeled Data for Crowd Counting by Learning to Rank
Recurrent Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for Small Organ Segmentation
Classifier Learning with Prior Probabilities for Facial Action Unit Recognition
Active Fixation Control to Predict Saccade Sequences
Reflection Removal for Large-Scale 3D Point Clouds
Mesoscopic Facial Geometry Inference Using Deep Neural Networks
VITON: An Image-based Virtual Try-on Network
Beyond the Pixel-Wise Loss for Topology-Aware Delineation
HashGAN: Deep Learning to Hash with Pair Conditional Wasserstein GAN
A Globally Optimal Solution to the Non-Minimal Relative Pose Problem
Learning distributions of shape trajectories from longitudinal datasets: a hierarchical model on a manifold of diffeomorphisms
Multispectral Image Intrinsic Decomposition via Low Rank Constraint
Dynamic Graph Generation Network: Generating Relational Knowledge from Diagrams
Alternating-Stereo VINS: Observability Analysis and Performance Evaluation
Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View
Style Aggregated Network for Facial Landmark Detection
VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors
Deep Adversarial Subspace Clustering
Compassionately Conservative Balanced Cuts for Image Segmentation
Deformable GANs for Pose-based Human Image Generation
Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration
The iNaturalist Species Classification and Detection Dataset
Categorizing Concepts with Basic Level for Vision-to-Language
InverseFaceNet: Deep Monocular Inverse Face Rendering at over 250 Hz
Textbook Question Answering under Teacher Guidance with Memory Networks
Learning to Find Good Correspondences
Hyperparameter Optimization for Tracking with Continuous Deep Q-Learning
Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data
Weakly Supervised Facial Action Unit Recognition through Adversarial Training
Knowledge Aided Consistency for Weakly Supervised Phrase Grounding
Neighbors Do Help: Deeply Exploiting Local Structures of Point Clouds
The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
Dense 3D Regression for Hand Pose Estimation
Detail-Preserving Pooling in Deep Networks
Dense Decoder Shortcut Connections for Single-Pass Semantic Segmentation
Reinforcement Cutting-Agent Learning for Video Object Segmentation
SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis
Wrapped Gaussian Process Regression on Riemannian Manifolds
Document Enhancement using Visibility Detection
Learning Discriminative Evaluation Metrics for Image Captioning
GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning
Learning Intelligent Dialogs for Bounding Box Annotation
Efficient Diverse Ensemble for Discriminative Co-Tracking
Recovering Realistic Texture in Image Super-resolution by Spatial Feature Modulation
Mining on Manifolds: Metric Learning without Labels
Revisiting knowledge transfer for training object class detectors
GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose
Differential Attention for Visual Question Answering
A PID Controller Approach for Stochastic Optimization of Deep Networks
Bootstrapping the Performance of Webly Supervised Semantic Segmentation
Iterative Learning with Open-set Noisy Labels
A Papier-Mâché Approach to Learning 3D Surface Generation
Extreme 3D Face Reconstruction: Looking Past Occlusions
High-speed Tracking with Multi-kernel Correlation Filters
Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification
Separating Style and Content for Generalized Style Transfer
Learning Dual Convolutional Neural Networks for Low-Level Vision
Wasserstein Introspective Neural Networks
Deep Semantic Face Deblurring
InLoc: Indoor Visual Localization with Dense Matching and View Synthesis
Temporal Hallucinating for Action Recognition with Few Still Images
Deep Texture Manifold for Ground Terrain Recognition
Discriminative Learning of Latent Features for Zero-Shot Recognition
Neural Sign Language Translation
GroupCap: Group-based Image Captioning with Structured Relevance and Diversity Constraints
Repulsion Loss: Detecting Pedestrians in a Crowd
Pulling Actions out of Context: Explicit Separation for Effective Combination
Deep Group-shuffling Random Walk for Person Re-identification
DenseASPP: Densely Connected Networks for Semantic Segmentation
A Variational U-Net for Conditional Appearance and Shape Generation
Universal Denoising Networks: A Novel CNN-based Network Architecture for Image Denoising
Automatic 3D Indoor Scene Modeling from Single Panorama
Five-point Fundamental Matrix Estimation for Uncalibrated Cameras
PU-Net: Point Cloud Upsampling Network
Generative Image Inpainting with Contextual Attention
Im2Flow: Motion Hallucination from Static Images for Action Recognition
Tagging Like Humans: Diverse and Distinct Image Annotation
TextureGAN: Controlling Deep Image Synthesis with Texture Patches
ISTA-Net: Interpretable Optimization-Inspired Deep Network for Image Compressive Sensing
Optimizing Video Object Detection via a Scale-Time Lattice
Context Embedding Networks
Motion-Guided Cascaded Refinement Network for Video Object Segmentation
RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints
Conditional Generative Adversarial Network for Structured Domain Adaptation
Large-scale Distance Metric Learning with Uncertainty
Hierarchical Novelty Detection for Visual Object Recognition
Deeper Look at Power Normalizations
Disentangling Factors of Variation by Mixing Them
Beyond Holistic Object Recognition: Enriching Image Understanding with Part States
LSTM Pose Machines
End-to-end Recovery of Human Shape and Pose
Geometric Multi-Model Fitting with a Convex Relaxation Algorithm
Revisiting Salient Object Detection: Simultaneous Detection, Ranking, and Subitizing of Multiple Salient Objects
Modulated Convolutional Networks
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
Learning Compressible 360° Video Isomers
Easy Identification from Better Constraints: Multi-Shot Person Re-Identification from Reference Constraints
TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
Good View Hunting: Learning Photo Composition from 1 Million View Pairs
Visual Relationship Learning with a Factorization-based Prior
Min-Entropy Latent Model for Weakly Supervised Object Detection
Boundary Flow: A Siamese Network that Predicts Boundary Motion without Training on Motion
SfSNet: Learning Shape, Reflectance and Illuminance of Faces 'in the wild'
Facial Expression Recognition by De-expression Residue Learning
Empirical study of the topology and geometry of deep networks
Learning Globally Optimized Object Detector via Policy Gradient
Learning from Synthetic Data: Semantic Segmentation using Generative Adversarial Networks
Recurrent Residual Module for Fast Inference in Videos
Viewpoint-aware Attentive Multi-view Inference for Vehicle Re-identification
Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing
Deep Adversarial Metric Learning
Learning Deep Models for Face Anti-Spoofing: Binary or Auxiliary Supervision
Art of singular vectors and universal adversarial perturbations
Free supervision from video games
Unifying Identification and Context Learning for Person Recognition
DensePose: Multi-Person Dense Human Pose Estimation In The Wild
End-to-end Convolutional Semantic Embeddings
Convolutional Image Captioning
Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis
Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++
Nonlinear 3D Face Morphable Model
OATM: Occlusion Aware Template Matching by Consensus Set Maximization
Multi-Image Semantic Matching by Mining Consistent Features
Explicit Loss-Error-Aware Quantization for Deep Neural Networks
Modeling Facial Geometry using Compositional VAEs
Encoding Crowd Interaction with Deep Neural Network for Pedestrian Trajectory Prediction
DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection under Partial Occlusion
Attentional ShapeContextNet for Point Cloud Recognition
Weakly Supervised Instance Segmentation using Class Peak Response
Fast and Robust Estimation for Unit-Norm Constrained Linear Fitting Problems
Maximum Classifier Discrepancy for Unsupervised Domain Adaptation
Multi-Level Factorisation Net for Person Re-Identification
Video Based Reconstruction of 3D People Models
Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer
Logo Synthesis and Manipulation with Clustered Generative Adversarial Networks
Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering
Image Super-resolution via Dual-state Recurrent Neural Networks
Excitation Backprop for RNNs
Image Generation from Scene Graphs
Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking
Image Restoration by Estimating Frequency Distribution of Local Patches
Learning to Adapt Structured Output Space for Semantic Segmentation
Deep Spatial Feature Reconstruction for Partial Person Re-identification
Tight Nonconvex Relaxation of MAP Inference
Multiple Granularity Group Interaction Prediction
Accurate and Diverse Sampling of Sequences based on a "Best of Many" Sample Objective
Learning Rich Features for Image Manipulation Detection
DA-GAN: Instance-level Image Translation by Deep Attention Generative Adversarial Network
A Benchmark for Articulated Human Pose Estimation and Tracking
Preserving Semantic Relations for Zero-Shot Learning
Geometry-Aware Scene Text Detection with Instance Transformation Network
CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise
Joint Cuts and Matching of Partitions in One Graph
Fast and Accurate Online Video Object Segmentation via Tracking Parts
Learning Nested Structures in Deep Neural Networks
Practical Block-wise Neural Network Architecture Generation
AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation
Modifying Non-Local Variations Across Multiple Views
Connecting Pixels to Privacy and Utility: Automatic Redaction of Private Information in Images
Divide and Grow: Capturing Huge Diversity in Crowd Images with Incrementally Growing CNN
When will you do what? - Anticipating Temporal Occurrences of Activities
Visual Question Answering with Memory-Augmented Networks
Stochastic Variational Inference with Gradient Linearization
Human Pose Estimation with Parsing Induced Learner
3D Registration of Curves and Surfaces using Local Differential Information
Deformation Aware Image Compression
PoseFlow: A Deep Motion Representation for Understanding Human Behaviors in Videos
MovieGraphs: Towards Understanding Human-Centric Situations from Videos
Hybrid Camera Pose Estimation
Fast Monte-Carlo Localization on Aerial Vehicles using Approximate Continuous Belief Representations
PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
Hierarchical Recurrent Attention Networks for Structured Online Maps
Learning Less is More - 6D Camera Localization via 3D Surface Regression
Visual Question Generation as Dual Task of Visual Question Answering
3D Object Detection with Latent Support Surfaces
An Analysis of Scale Invariance in Object Detection - SNIP
3D Semantic Trajectory Reconstruction from 3D Pixel Continuum
KIPPI: KInetic Polygonal Partitioning of Images
COCO-Stuff: Thing and Stuff Classes in Context
Joint Optimization Framework for Learning with Noisy Labels
Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks
Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation
Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
Deep Back-Projection Networks For Super-Resolution
Generating a Fusion Image: One's Identity and Another's Shape
V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map
Long-Term On-Board Prediction of People in Traffic Scenes under Uncertainty
Cross-modal Deep Variational Hand Pose Estimation
Learning to Estimate 3D Human Pose and Shape from a Single Color Image
Video Rain Removal By Multiscale Convolutional Sparse Coding
Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning
Learning 3D Shape Completion from Point Clouds with Weak Supervision
SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels
Salience Guided Depth Calibration for Perceptually Optimized Compressive Light Field 3D Display
Weakly-supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation
Rolling Shutter and Radial Distortion are Features for High Frame Rate Multi-camera Tracking
Robust Hough Transform Based 3D Reconstruction from Circular Light Fields
Feedback-prop: Convolutional Neural Network Inference under Partial Evidence
Learning Strict Identity Mappings in Deep Residual Networks
Residual Parameter Transfer for Deep Domain Adaptation
Exploring Disentangled Feature Representation Beyond Face Identification
SPLATNet: Sparse Lattice Networks for Point Cloud Processing
Unsupervised Training for 3D Morphable Model Regression
A Bi-directional Message Passing Model for Salient Object Detection
Learning to See in the Dark
Erase or Fill? Deep Joint Recurrent Rain Removal and Reconstruction in Videos
Finding beans in burgers: Deep semantic-visual embedding with localization
Referring Relationships
Adversarially Learned One-Class Classifier for Novelty Detection
Surface Networks
Efficient parametrization of multi-domain deep neural networks
Recognizing Human Actions as Evolution of Pose Estimation Maps
Soccer on Your Tabletop
CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization
Gesture Recognition: Focus on the Hands
Video Object Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF
Real-world Anomaly Detection in Surveillance Videos
Learning a Single Convolutional Super-Resolution Network for Multiple Degradations
Iterative Visual Reasoning Beyond Convolutions
Guide Me: Interacting with Deep Networks
PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection
Future Frame Prediction for Anomaly Detection - A New Baseline
Structure Preserving Video Prediction
Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks
Captioning Images with Style Transfer from Unaligned Text Corpora
Anatomical Priors in Convolutional Networks for Unsupervised Biomedical Segmentation
Illuminant Spectra-based Source Separation Using Flash Photography
3D Human Pose Reconstruction and Action Classification in Robot Assisted Therapy of Children with Autism
Discrete-Continuous ADMM for Transductive Inference in Higher-Order MRFs
Classification Driven Dynamic Image Enhancement
Feature Generating Networks for Zero-Shot Learning
Beyond Trade-off: Accelerate FCN-based Face Detection with Higher Accuracy
MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition
Unsupervised Learning and Segmentation of Complex Activities from Video
Sparse Photometric 3D Face Reconstruction Guided by Morphable Models
LSTM stack-based Neural Multi-sequence Alignment TeCHnique (NeuMATCH)
Inverse Composition Discriminative Optimization for Point Cloud Registration
Inference in Higher Order MRF-MAP Problems with Small and Large Cliques
Look at Boundary: A Boundary-Aware Face Alignment Algorithm
LEGO: Learning Edge with Geometry all at Once by Watching Videos
CosFace: Large Margin Cosine Loss for Deep Face Recognition
Learning Semantic Concepts and Order for Image and Sentence Matching
Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks
Low-shot learning with large-scale diffusion
Multimodal Visual Concept Learning with Weakly Supervised Techniques
Cross-View Image Synthesis using Conditional Generative Adversarial Nets
Pixel-Wise Metric Learning for Blazingly Fast Video Object Segmentation
PieAPP: Perceptual Image-Error Assessment through Pairwise Preference
Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos
CRRN: Multi-Scale Guided Concurrent Reflection Removal Network
Stereoscopic Neural Style Transfer
Low-shot Learning from Imaginary Data
Fast, Simple, and Effective Resource-Constrained Structure Learning of Deep Networks
Unsupervised Sparse Dirichlet-Net for Hyperspectral Image Super-Resolution
Visual Grounding via Accumulated Attention
Event-based Vision meets Deep Learning on Steering Prediction for Self-driving Cars
Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes
Actor and Action Video Segmentation from a Sentence
AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
CartoonGAN: Generative Adversarial Networks for Photo Cartoonization
RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials
Tracking Multiple Objects Outside the Line of Sight using Speckle Imaging
Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation
Densely Connected Pyramid Dehazing Network
Matching Adversarial Networks
Automatic Map Inference from Aerial Images
Polarimetric Dense Monocular SLAM
Learning Attribute Representations with Localization for Flexible Fashion Search
Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval
Unsupervised CCA
Analyzing Filters Toward Efficient ConvNet
Good Appearance Features for Multi-Target Multi-Camera Tracking
Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning
Efficient Optimization for Rank-based Loss Functions
ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing
A Perceptual Measure for Deep Single Image Camera Calibration
Radially-Distorted Conjugate Translations
Multi-task Learning by Maximizing Statistical Dependence
Creating Capsule Wardrobes from Fashion Images
Towards Human-Machine Cooperation: Evolving Active Learning with Self-supervised Process for Object Detection
Synthesizing Images of Humans in Unseen Poses
Learning to Act Properly: Predicting and Explaining Affordances from Images
Pyramid Stereo Matching Network
Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene
A General Two-Step Quantization Approach for Low-bit Neural Networks with High Accuracy
GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition
Convolutional Neural Networks with Alternately Updated Clique
Squeeze-and-Excitation Networks
NISP: Pruning Networks using Neuron Importance Score Propagation
Audio to Body Dynamics
ID-GAN: Learning a Symmetry Three-Player GAN for Identity-Preserving Face Synthesis
Deep Learning of Graph Matching
Neural Baby Talk
Efficient Video Object Segmentation via Network Modulation
Regularizing Deep Networks by Modeling and Predicting Label Structure
Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised Semantic Segmentation
Face Detector Adaptation without Negative Transfer or Catastrophic Forgetting
Motion-Appearance Co-Memory Networks for Video Question Answering
Compare and Contrast: Learning Prominent Visual Differences
Tangent Convolutions for Dense Prediction in 3D
Single-Shot Object Detection with Enriched Semantics
Generating Synthetic X-ray Images of a Person from the Surface Geometry
Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
Edit Probability for Scene Text Recognition
MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment
Texture Mapping for 3D Reconstruction with RGB-D Sensor
Multi-Agent Diverse Generative Adversarial Networks
Towards Universal Representation for Unseen Action Recognition
Zero-Shot Kernel Learning
DOTA: A Large-scale Dataset for Object Detection in Aerial Images
Multi-Frame Quality Enhancement for Compressed Video
From Lifestyle VLOGs to Everyday Interactions
Occluded Pedestrian Detection through Guided Attention in CNNs
Decoupled Networks
Deep Cocktail Networks: Multi-source Unsupervised Domain Adaptation with Category Shift
Partially Shared Multi-Task Convolutional Neural Network with Local Constraint for Face Attribute Learning
Joint Pose and Expression Modeling for Facial Expression Recognition
Unsupervised Textual Grounding: Linking Words to Image Concepts
Interleaved Structured Sparse Convolutional Neural Networks
Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models
ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes
Image to Image Translation for Domain Adaptation
A Face to Face Neural Conversation Model
Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification
FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors
SO-Net: Self-Organizing Network for Point Cloud Analysis
MoNet: Moments Embedding Network
Coupled End-to-end Transfer Learning with Generalized Fisher Information
Inferring Light Fields from Shadows
LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image
Multi-Level Fusion based 3D Object Detection from Monocular Images
Single-Image Depth Estimation Based on Fourier Domain Analysis
Flow Guided Recurrent Neural Encoder for Video Salient Object Detection
Super-Resolving Very Low-Resolution Face Images with Supplementary Attributes
Seeing Voices and Hearing Faces: Cross-modal biometric matching
Feature Mapping for Learning Fast and Accurate 3D Pose Inference from Synthetic Images
Fast and Accurate Single Image Super-Resolution via Information Distillation Network
Learning and Using the Arrow of Time
Rethinking the Faster R-CNN Architecture for Temporal Action Localization
Deeply Learned Filter Response Functions for Hyperspectral Reconstruction
Fusing Crowd Density Maps and Visual Object Trackers for People Tracking in Crowd Scenes
Intrinsic Image Transformation via Scale Space Decomposition
Deep Ordinal Regression Network for Monocular Depth Estimation
Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation
Functional Map of the World
CSGNet: Neural Shape Parser for Constructive Solid Geometry
Instance Embedding Transfer to Unsupervised Video Object Segmentation
Statistical Tomography of Microscopic Life
Point-wise Convolutional Neural Networks
PIXOR: Real-time 3D Object Detection from Point Clouds
HydraNets: Specialized Dynamic Architectures for Efficient Inference
Deep Depth Completion of a Single RGB-D Image
Learning to Extract a Video Sequence from a Single Motion-Blurred Image
A Fast Resection-Intersection Method for the Known Rotation Problem
iVQA: Inverse Visual Question Answering
Crowd Counting via Adversarial Cross-Scale Consistency Pursuit
Trust your Model: Light Field Depth Estimation with inline Occlusion Handling
PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition
A Memory Network Approach for Story-based Temporal Summarization of 360° Videos
Tags2Parts: Discovering Semantic Regions from Shape Tags
Jerk-Aware Video Acceleration Magnification
A Robust Method for Strong Rolling Shutter Effects Correction Using Lines with Automatic Feature Selection
Mobile Video Object Detection with Temporally-Aware Feature Maps
VirtualHome: Simulating Household Activities via Programs
MoNet: Deep Motion Exploitation for Video Object Segmentation
Detect globally, refine locally: A novel approach to saliency detection
EPINET: A Fully-Convolutional Neural Network for Light Field Depth Estimation by Using Epipolar Geometry
Learning Face Age Progression: A Pyramid Architecture of GANs
Normalized Cut Loss for Weakly Supervised CNN Segmentation
Reconstructing Thin Structures of Manifold Surfaces by Integrating Spatial Curves
Dynamic Few-Shot Visual Learning without Forgetting
Camera Style Adaptation for Person Re-identification
In-Place Activated BatchNorm for Memory-Optimized Training of DNNs
NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning
Resource Aware Person Re-identification across Multiple Resolutions
Zero-Shot Super-Resolution using Deep Internal Learning
Analysis of Hand Segmentation in the Wild
Who's Better? Who's Best? Pairwise Deep Ranking for Skill Determination
Face Aging with Identity-Preserved Conditional Generative Adversarial Networks
Deep Extreme Cut: From Extreme Points to Object Segmentation
Person Re-identification with Cascaded Pairwise Convolutions
Distributable Consistent Multi-Graph Matching
A Twofold Siamese Network for Real-Time Object Tracking
AON: Towards Arbitrarily-Oriented Text Recognition
Deep Cauchy Hashing for Hamming Space Retrieval
Non-blind Deblurring: Handling Kernel Uncertainty with CNNs
Referring Image Segmentation via Recurrent Refinement Networks
Deep Density Clustering of Unconstrained Faces
A Constrained Deep Neural Network for Ordinal Regression

kaluo-zZ/CVPR2018-papers