[Paper Review] U-Net: Convolutional Networks for Biomedical Image Segmentation

Yonsei University Artificial Intelligence Society (YAI)

Computer Vision: CV

_YAI_ 2023. 3. 4. 14:51

U-Net: Convolutional Networks for Biomedical Image Segmentation

** Written by 안정우 of the YAI 10th cohort, as part of the Vision Paper Fundamentals team.

Abstract

Presents a network and training strategy that relies on the strong use of data augmentation
- Uses the available annotated samples more efficiently
The architecture consists of a contracting path and a symmetric expanding path
- Contracting path
  - Captures context
- Expanding path
  - Enables precise localization
Can be trained end-to-end from very few images
Outperforms the prior best method
- Prior best method: sliding-window conv net
- ISBI challenge for segmentation of neuronal structures in electron microscopic stacks
- Also won the ISBI cell tracking challenge 2015
Fast inference
- Segmenting a 512x512 image takes less than a second on a then-current GPU

Introduction

Deep convolutional networks have outperformed the state of the art in many visual recognition tasks
Typical use of conv nets: classification tasks
- The output for an image is a single class label

Segmentation
- in biomedical image processing
  - output should include localization
  - a class label is supposed to be assigned to each pixel
Biomedical tasks
- thousands of training images are usually beyond reach
- Ciresan et al.
  - trained a network in a sliding-window setup to predict the class label of each pixel by providing a local region (patch) around that pixel as input
  - Advantages: the network can localize, and the number of training patches is much larger than the number of training images
  - Drawbacks
    - Slow
      - the network must be run separately for each patch, and overlapping patches cause a lot of redundant computation
    - Trade-off between localization accuracy and the use of context
      - Large patches -> require more max-pooling layers -> reduced localization accuracy
      - Small patches -> the network sees only little context
Proposes a new architecture: "U-Net"
- builds on a more elegant architecture, the so-called "fully convolutional network"
- modifies and extends this architecture such that it
  - works with very few training images
  - yields more precise segmentations

Contracting path
Expansive path
- Upsampling layers
  - increase the resolution

High resolution features from the contracting path are combined with the upsampled output
- for localization

Upsampling part
- Large number of feature channels -> allows the network to propagate context information to higher-resolution layers
No fully connected layers
- Uses only the valid part of each convolution
- The segmentation map contains only the pixels for which the full context is available in the input image
- Allows seamless segmentation of arbitrarily large images with an overlap-tile strategy
- Overlap-tile strategy
  - To predict pixels in the border region of the image, the missing context is extrapolated by mirroring the input image
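
The mirroring step of the overlap-tile strategy can be sketched in plain Python. This is a minimal illustration, not the authors' code: `reflect_index` and `mirror_pad` are hypothetical helper names, and the sketch assumes the mirroring does not repeat the edge pixel (the same convention as NumPy's `reflect` padding mode), which these notes do not specify.

```python
def reflect_index(i, n):
    """Map any integer index onto [0, n) by mirroring at the borders.

    Assumption: the edge pixel itself is not repeated.
    """
    if n == 1:
        return 0
    period = 2 * (n - 1)
    i = i % period  # Python's % is non-negative for a positive period
    return i if i < n else period - i


def mirror_pad(image, pad):
    """Pad a 2D list-of-lists image by `pad` pixels on every side."""
    height, width = len(image), len(image[0])
    return [
        [image[reflect_index(y, height)][reflect_index(x, width)]
         for x in range(-pad, width + pad)]
        for y in range(-pad, height + pad)
    ]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
padded = mirror_pad(image, 1)
for row in padded:
    print(row)
```

With the 572x572 input tile and 388x388 output tile shown in the paper's architecture figure, the missing context would be (572 - 388) / 2 = 92 pixels per side.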

Excessive data augmentation with elastic deformations
- allows the network to learn invariance to such deformations
- particularly important in biomedical segmentation
  - deformation is the most common variation in tissue, and realistic deformations can be simulated efficiently

Separation of touching objects
- Weighted loss
  - the background labels that separate touching cells are given a large weight in the loss function
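
As a rough sketch of how such a weight can be computed per pixel: the paper defines the weight map as w(x) = w_c(x) + w0 * exp(-(d1(x) + d2(x))^2 / (2 * sigma^2)), where d1 and d2 are the distances to the borders of the nearest and second-nearest cell, with w0 = 10 and sigma of about 5 pixels. These details come from the paper rather than from the notes above, and `border_weight` is a hypothetical name. The distances are assumed to be precomputed.

```python
import math


def border_weight(class_weight, d1, d2, w0=10.0, sigma=5.0):
    """Loss weight for one pixel.

    class_weight: w_c(x), the term that balances class frequencies.
    d1, d2: distances (in pixels) to the nearest and second-nearest cell border.
    w0, sigma: defaults taken from the values reported in the paper.
    """
    return class_weight + w0 * math.exp(-((d1 + d2) ** 2) / (2.0 * sigma ** 2))


# A background pixel squeezed between two touching cells gets a large weight,
# while a pixel far away from any pair of cells keeps roughly its class weight.
print(border_weight(1.0, 1.0, 1.0))
print(border_weight(1.0, 50.0, 50.0))
```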

SOTA Results

Network Architecture

Contracting path (left side), Expansive path (right side)

Contracting path
- follows the typical architecture of ConvNet
- repeated blocks of two 3x3 convolutions (unpadded), each followed by a ReLU, and then a 2x2 max-pooling with stride 2 for downsampling
- doubled number of feature channels at each downsampling step

Expansive path
- 2x2 up-convolution that halves the number of feature channels
- concatenation with the correspondingly cropped feature map from the contracting path
  - cropping is needed because border pixels are lost in every unpadded convolution
- two 3x3 convolutions, each followed by a ReLU
- at the final layer, a 1x1 convolution maps each 64-component feature vector to the desired number of classes

23 convolutional layers in total
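
The size bookkeeping implied by the unpadded convolutions can be checked with a few lines of arithmetic. The sketch below is only an illustration under stated assumptions: four down/up levels and a 572x572 input tile (as in the paper's architecture figure, not stated in these notes), with hypothetical function names.

```python
def unet_sizes(input_size, depth=4):
    """Trace the spatial size of the feature maps through a U-Net.

    Returns the output size and, for each skip connection (deepest first),
    a pair (size before cropping, size after cropping).
    """
    size = input_size
    skip_sizes = []
    for _ in range(depth):
        size -= 4                  # two unpadded 3x3 convs, 2 pixels lost each
        skip_sizes.append(size)
        if size % 2:
            raise ValueError("max-pooling needs an even feature-map size")
        size //= 2                 # 2x2 max-pooling with stride 2
    size -= 4                      # two convs at the bottom of the "U"
    crops = []
    for skip in reversed(skip_sizes):
        size *= 2                  # 2x2 up-convolution
        crops.append((skip, size)) # skip map is cropped to the upsampled size
        size -= 4                  # two unpadded 3x3 convs
    return size, crops


def count_conv_layers(depth=4):
    """Count convolutions: 3x3 convs, up-convolutions, and the final 1x1 conv."""
    contracting = 2 * depth
    bottom = 2
    expansive = 3 * depth          # one up-conv plus two 3x3 convs per level
    return contracting + bottom + expansive + 1


out_size, crops = unet_sizes(572)
print(out_size, crops, count_conv_layers())
```

Under these assumptions the 572x572 input yields a 388x388 segmentation map, and the layer count matches the 23 convolutional layers stated above.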

Training

Large input tiles are favored over a large batch size -> the batch is reduced to a single image
- to minimize the overhead and make maximum use of GPU memory

High momentum (0.99) -> a large number of previously seen training samples determine the update in the current optimization step
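
To see why a momentum of 0.99 lets many past samples influence each update, here is a minimal sketch of a momentum SGD step in its common form (velocity = momentum * velocity - lr * grad). The learning rate is a made-up value for illustration and `sgd_momentum_step` is a hypothetical name, not the authors' training code.

```python
def sgd_momentum_step(weight, velocity, grad, lr=0.01, momentum=0.99):
    """One SGD-with-momentum update for a single scalar weight.

    lr is a hypothetical value; momentum=0.99 is the value from the paper.
    """
    velocity = momentum * velocity - lr * grad
    return weight + velocity, velocity


# The gradient seen k steps ago still contributes with a factor momentum**k.
# With batch size 1, that means roughly the last hundred images matter.
contribution_after_100_steps = 0.99 ** 100
print(round(contribution_after_100_steps, 3))

w, v = 0.0, 0.0
for _ in range(2):
    w, v = sgd_momentum_step(w, v, grad=1.0)
print(w, v)
```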

The energy function is computed by a pixel-wise soft-max over the final feature map, combined with the cross-entropy loss function

cross entropy
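
The pixel-wise soft-max and the weighted cross entropy can be sketched as follows. This is a minimal pure-Python illustration, written as a loss to be minimized (negative log-likelihood). The function names and the flat list-of-pixels layout are assumptions made for the example.

```python
import math


def pixelwise_softmax(activations):
    """Soft-max over the K channel activations of one pixel."""
    peak = max(activations)                       # subtract max for stability
    exps = [math.exp(a - peak) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_cross_entropy(activation_map, labels, weights):
    """Sum over pixels of -w(x) * log(p_label(x)).

    activation_map: one list of K activations per pixel.
    labels: true class index per pixel.
    weights: loss weight per pixel (e.g. from a weight map).
    """
    loss = 0.0
    for activations, label, weight in zip(activation_map, labels, weights):
        probs = pixelwise_softmax(activations)
        loss -= weight * math.log(probs[label])
    return loss


# Two pixels, two classes. The second pixel is weighted twice as strongly.
loss = weighted_cross_entropy([[0.0, 0.0], [0.0, 0.0]], [0, 1], [1.0, 2.0])
print(loss)
```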

Data Augmentation

Shift and rotation invariance, robustness to gray value variations, and random elastic deformations
Drop-out layers at the end of the contracting path
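
A rough sketch of how a random elastic deformation field can be generated: the paper samples random displacement vectors on a coarse 3x3 grid from a Gaussian with a standard deviation of 10 pixels and interpolates per-pixel displacements with bicubic interpolation. The sketch below swaps in bilinear interpolation to keep the code short, so it is not the paper's exact procedure, and all function names are hypothetical.

```python
import random


def coarse_displacements(grid=3, sigma=10.0, rng=None):
    """Random (dy, dx) displacement vectors on a coarse grid x grid lattice."""
    rng = rng or random.Random()
    return [[(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
             for _ in range(grid)] for _ in range(grid)]


def dense_displacements(coarse, height, width):
    """Bilinearly interpolate the coarse lattice to one vector per pixel.

    Assumes height >= 2 and width >= 2. Bilinear interpolation is a
    simplification of the bicubic interpolation used in the paper.
    """
    grid = len(coarse)
    field = []
    for y in range(height):
        gy = y * (grid - 1) / (height - 1)   # position in lattice coordinates
        y0 = min(int(gy), grid - 2)
        ty = gy - y0
        row = []
        for x in range(width):
            gx = x * (grid - 1) / (width - 1)
            x0 = min(int(gx), grid - 2)
            tx = gx - x0
            vector = []
            for c in range(2):
                top = (1 - tx) * coarse[y0][x0][c] + tx * coarse[y0][x0 + 1][c]
                bottom = ((1 - tx) * coarse[y0 + 1][x0][c]
                          + tx * coarse[y0 + 1][x0 + 1][c])
                vector.append((1 - ty) * top + ty * bottom)
            row.append(tuple(vector))
        field.append(row)
    return field


coarse = coarse_displacements(rng=random.Random(0))
field = dense_displacements(coarse, 9, 9)
print(field[4][4], coarse[1][1])
```

Each pixel of the training image and of its label map would then be resampled at its position plus the displacement vector, so that the image and the annotation stay aligned.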

Experiments

Conclusion

The U-Net architecture achieves very good performance on very different biomedical segmentation applications
Thanks to data augmentation with elastic deformations, it needs only very few annotated images and has a very reasonable training time

Discussions

The warping error is a segmentation metric that tolerates disagreements over boundary location, penalizes topological disagreements, and can be used directly as a cost function for learning boundary detection [1].

In other words, instead of focusing on the geometric differences (pixel disagreement) between two segmentations, the warping error focuses on the objects and measures the topological error between them.
