Cách hiệu quả nhất để xóa các bản sao và sắp xếp một vectơ là gì?

274

Tôi cần lấy một vectơ C ++ có khả năng rất nhiều phần tử, xóa các bản sao và sắp xếp nó.

Tôi hiện có mã dưới đây, nhưng nó không hoạt động.

vec.erase(
      std::unique(vec.begin(), vec.end()),
      vec.end());
std::sort(vec.begin(), vec.end());

Làm thế nào tôi có thể làm điều này một cách chính xác?

Ngoài ra, nó có nhanh hơn để xóa các bản sao đầu tiên (tương tự như được mã hóa ở trên) hoặc thực hiện sắp xếp trước không? Nếu tôi thực hiện sắp xếp trước, nó có được đảm bảo duy trì sắp xếp sau khi std::uniqueđược thực hiện không?

Hoặc có một cách khác (có lẽ hiệu quả hơn) để làm tất cả điều này?

— Kyle Ryan
nguồn

3

Tôi giả sử bạn không có tùy chọn kiểm tra trước khi chèn để tránh bị lừa ở vị trí đầu tiên?

— Joe

Chính xác. Đó sẽ là lý tưởng.

— Kyle Ryan

29

Tôi sẽ đề nghị sửa mã ở trên, hoặc thực sự chỉ ra rằng đó là SAI. std :: unique giả định phạm vi đã được sắp xếp.

— Matthieu M.

583

Tôi đồng ý với R. Pate và Todd Gardner ; một std::setý tưởng tốt ở đây Ngay cả khi bạn bị mắc kẹt khi sử dụng vectơ, nếu bạn có đủ các bản sao, bạn có thể tốt hơn là tạo một bộ để thực hiện công việc bẩn.

Hãy so sánh ba cách tiếp cận:

Chỉ cần sử dụng vector, sort + unique

sort( vec.begin(), vec.end() );
vec.erase( unique( vec.begin(), vec.end() ), vec.end() );

Chuyển đổi thành thiết lập (thủ công)

set<int> s;
unsigned size = vec.size();
for( unsigned i = 0; i < size; ++i ) s.insert( vec[i] );
vec.assign( s.begin(), s.end() );

Chuyển đổi thành thiết lập (sử dụng hàm tạo)

set<int> s( vec.begin(), vec.end() );
vec.assign( s.begin(), s.end() );

Đây là cách chúng thực hiện khi số lượng trùng lặp thay đổi:

so sánh phương pháp vectơ và tập hợp

Tóm tắt : khi số lượng trùng lặp đủ lớn, thực sự nhanh hơn để chuyển đổi thành một tập hợp và sau đó chuyển dữ liệu trở lại thành một vectơ .

Và vì một số lý do, thực hiện chuyển đổi cài đặt theo cách thủ công dường như nhanh hơn so với sử dụng hàm tạo - ít nhất là trên dữ liệu ngẫu nhiên của đồ chơi mà tôi đã sử dụng.

— Nate Kohl
nguồn

61

Tôi bị sốc vì cách tiếp cận của nhà xây dựng luôn tệ hơn so với thủ công. Bạn sẽ làm điều đó ngoài một số chi phí không đổi nhỏ, nó sẽ làm việc thủ công. Bất cứ ai có thể giải thích điều này?

— Ari

17

Thật tuyệt, cảm ơn vì đồ thị. Bạn có thể cho biết ý nghĩa của các đơn vị đối với Số lượng trùng lặp không? (tức là xung quanh mức độ "đủ lớn")?

— Kyle Ryan

5

@Kyle: Nó khá lớn. Tôi đã sử dụng bộ dữ liệu gồm 1.000.000 số nguyên được rút ngẫu nhiên trong khoảng từ 1 đến 1000, 100 và 10 cho biểu đồ này.

— Nate Kohl

5

Tôi nghĩ rằng kết quả của bạn là sai. Trong các thử nghiệm của tôi, các phần tử trùng lặp nhiều hơn, vectơ nhanh hơn (so sánh), thực sự chia tỷ lệ theo cách khác. Bạn đã biên dịch với tối ưu hóa và kiểm tra thời gian chạy chưa?. Về phía tôi, vector luôn nhanh hơn, lên tới 100 lần tùy thuộc vào số lượng trùng lặp. VS2013, cl / Ox -D_SECURE_SCL = 0.

— davidnr

39

Mô tả về trục x dường như bị thiếu.

— BartoszKP

72

Tôi làm lại hồ sơ của Nate Kohl và nhận được kết quả khác nhau. Đối với trường hợp thử nghiệm của tôi, việc sắp xếp trực tiếp vectơ luôn hiệu quả hơn so với sử dụng một tập hợp. Tôi đã thêm một phương pháp mới hiệu quả hơn, sử dụng mộtunordered_set .

Hãy nhớ rằng unordered_setphương thức chỉ hoạt động nếu bạn có hàm băm tốt cho loại bạn cần duy nhất và được sắp xếp. Đối với ints, điều này là dễ dàng! (Thư viện chuẩn cung cấp hàm băm mặc định đơn giản là hàm nhận dạng.) Ngoài ra, đừng quên sắp xếp ở cuối vì unordered_set là, tốt, không có thứ tự :)

Tôi đã thực hiện một số công việc đào sâu bên trong setvà unordered_settriển khai và phát hiện ra rằng hàm tạo thực sự xây dựng một nút mới cho mọi phần tử, trước khi kiểm tra giá trị của nó để xác định xem nó có thực sự được chèn hay không (ít nhất là trong triển khai Visual Studio).

Dưới đây là 5 phương pháp:

F1: Chỉ cần sử dụng vector, sort+unique

sort( vec.begin(), vec.end() );
vec.erase( unique( vec.begin(), vec.end() ), vec.end() );

f2: Chuyển đổi sang set(sử dụng hàm tạo)

set<int> s( vec.begin(), vec.end() );
vec.assign( s.begin(), s.end() );

f3: Chuyển đổi sang set(thủ công)

set<int> s;
for (int i : vec)
    s.insert(i);
vec.assign( s.begin(), s.end() );

f4: Chuyển đổi sang unordered_set(sử dụng hàm tạo)

unordered_set<int> s( vec.begin(), vec.end() );
vec.assign( s.begin(), s.end() );
sort( vec.begin(), vec.end() );

f5: Chuyển đổi sang unordered_set(thủ công)

unordered_set<int> s;
for (int i : vec)
    s.insert(i);
vec.assign( s.begin(), s.end() );
sort( vec.begin(), vec.end() );

Tôi đã thực hiện thử nghiệm với một vectơ 100.000.000 int được chọn ngẫu nhiên trong các phạm vi [1,10], [1,1000] và [1,100000]

Kết quả (tính bằng giây, nhỏ hơn là tốt hơn):

range         f1       f2       f3       f4      f5
[1,10]      1.6821   7.6804   2.8232   6.2634  0.7980
[1,1000]    5.0773  13.3658   8.2235   7.6884  1.9861
[1,100000]  8.7955  32.1148  26.5485  13.3278  3.9822

— alexk7
nguồn

4

Đối với số nguyên, bạn có thể sử dụng sắp xếp radix, nhanh hơn nhiều so với std :: sort.

— Thay đổi CN

2

Mẹo nhanh, để sử dụng sorthoặc uniquephương pháp, bạn phải#include <algorithm>

— Davmrtl

3

@ChangmingSun Tôi tự hỏi tại sao trình tối ưu hóa dường như thất bại trên f4? Các con số khác nhau đáng kể đến f5. Nó không có ý nghĩa gì với tôi.

— Sandthorn

1

@sandthorn Như đã giải thích trong câu trả lời của tôi, việc triển khai xây dựng một nút (bao gồm phân bổ động) cho mỗi phần tử từ chuỗi đầu vào, điều này gây lãng phí cho mọi giá trị cuối cùng là một bản sao. Không có cách nào trình tối ưu hóa có thể biết nó có thể bỏ qua điều đó.

— alexk7

À, điều đó làm tôi nhớ đến một trong những cuộc nói chuyện của Scott Meyer về CWUK scenerio có bản chất của sự phù hợp để làm chậm emplaceloại công trình.

— Sandthorn

57

std::unique chỉ xóa các phần tử trùng lặp nếu chúng là hàng xóm: bạn phải sắp xếp vectơ trước khi nó hoạt động như bạn dự định.

std::unique được định nghĩa là ổn định, vì vậy vectơ vẫn sẽ được sắp xếp sau khi chạy duy nhất trên nó.

— người nhảy
nguồn

42

Tôi không chắc bạn đang sử dụng cái này để làm gì, vì vậy tôi không thể nói điều này với sự chắc chắn 100%, nhưng thông thường khi tôi nghĩ rằng container "được sắp xếp, duy nhất", tôi nghĩ về một bộ std :: . Nó có thể phù hợp hơn với usecase của bạn:

std::set<Foo> foos(vec.begin(), vec.end()); // both sorted & unique already

Mặt khác, sắp xếp trước khi gọi duy nhất (như các câu trả lời khác đã chỉ ra) là cách để đi.

— Todd Gardner
nguồn

Vâng đến điểm! std :: set được chỉ định là một bộ duy nhất được sắp xếp. Hầu hết các triển khai sử dụng một cây nhị phân có hiệu quả hoặc một cái gì đó tương tự.

— notnoop

+1 Suy nghĩ về thiết lập là tốt. Không muốn lặp lại câu trả lời này

— Tom

Là std :: set đảm bảo được sắp xếp? Nó có ý nghĩa rằng trong thực tế nó sẽ được, nhưng tiêu chuẩn yêu cầu nó?

— MadCoder

1

Yup, xem 23.1.4.9 "Thuộc tính cơ bản của các trình vòng lặp của các container kết hợp là chúng lặp qua các container theo thứ tự các khóa không giảm dần được xác định bằng cách so sánh được sử dụng để xây dựng chúng"

— Todd Gardner

1

@MadCoder: Không nhất thiết phải "có ý nghĩa" rằng một bộ được thực hiện theo cách được sắp xếp. Cũng có những bộ được thực hiện bằng cách sử dụng bảng băm, không được sắp xếp. Trong thực tế, hầu hết mọi người thích sử dụng bảng băm khi có sẵn. Nhưng quy ước đặt tên trong C ++ chỉ xảy ra khi các thùng chứa kết hợp được sắp xếp được đặt tên đơn giản là "set" / "map" (tương tự với TreeSet / TreeMap trong Java); và các thùng chứa kết hợp được băm, nằm ngoài tiêu chuẩn, được gọi là "hash_set" / "hash_map" (SGI STL) hoặc "unordered_set" / "unordered_map" (TR1) (tương tự như Hashset và HashMap trong Java)

— newacct

21

std::uniquechỉ hoạt động trên các lần chạy liên tiếp của các phần tử trùng lặp, vì vậy bạn nên sắp xếp trước. Tuy nhiên, nó ổn định, vì vậy vector của bạn sẽ vẫn được sắp xếp.

— David Seiler
nguồn

18

Đây là một mẫu để làm điều đó cho bạn:

template<typename T>
void removeDuplicates(std::vector<T>& vec)
{
    std::sort(vec.begin(), vec.end());
    vec.erase(std::unique(vec.begin(), vec.end()), vec.end());
}

gọi nó như:

removeDuplicates<int>(vectorname);

— DShook
nguồn

2

+1 Templatize đi! - nhưng bạn chỉ có thể viết removeD repeatates (vec) mà không chỉ định rõ ràng các đối số mẫu

— Faisal Vali

10

Hoặc thậm chí tốt hơn, chỉ cần nó lấy các trình vòng lặp templated trực tiếp (bắt đầu và kết thúc), và bạn có thể chạy nó trên các cấu trúc khác bên cạnh một vectơ.

— Kyle Ryan

Địa ngục, các mẫu! sửa chữa nhanh chóng cho danh sách nhỏ, phong cách STL đầy đủ. +1 thx

— QuantumKarl

@Kyle - chỉ trên các bộ chứa khác có erase()phương thức, nếu không, bạn phải trả lại bộ lặp kết thúc mới và có mã cuộc gọi cắt ngắn bộ chứa.

— Toby Speight

8

Hiệu quả là một khái niệm phức tạp. Có các cân nhắc về thời gian và không gian, cũng như các phép đo chung (trong đó bạn chỉ nhận được các câu trả lời mơ hồ như O (n)) so với các câu hỏi cụ thể (ví dụ: sắp xếp bong bóng có thể nhanh hơn nhiều so với quicksort, tùy thuộc vào đặc điểm đầu vào).

Nếu bạn có tương đối ít bản sao, thì sắp xếp theo sau là duy nhất và xóa dường như là cách để đi. Nếu bạn có tương đối nhiều bản sao, việc tạo một tập hợp từ vectơ và để cho nó thực hiện việc nâng vật nặng có thể dễ dàng đánh bại nó.

Đừng chỉ tập trung vào hiệu quả thời gian. Sắp xếp + duy nhất + xóa hoạt động trong không gian O (1), trong khi cấu trúc thiết lập hoạt động trong không gian O (n). Và không trực tiếp cho vay để song song hóa bản đồ (cho các bộ dữ liệu thực sự lớn ).

Điều gì sẽ cung cấp cho bạn bản đồ / giảm khả năng? Điều duy nhất tôi có thể nghĩ đến là một loại hợp nhất phân tán và bạn vẫn chỉ có thể sử dụng một luồng trong hợp nhất cuối cùng.

— Zan Lynx

1

Có, bạn phải có một nút / luồng kiểm soát. Tuy nhiên, bạn có thể chia vấn đề nhiều lần theo yêu cầu để đặt giới hạn trên cho số lượng luồng công nhân / con mà luồng xử lý / cha mẹ xử lý và trên kích thước của tập dữ liệu mà mỗi nút lá phải xử lý. Không phải tất cả các vấn đề đều có thể giải quyết dễ dàng với việc giảm bản đồ, tôi chỉ đơn giản muốn chỉ ra rằng có những người xử lý các vấn đề tối ưu hóa tương tự (trên bề mặt, dù sao), trong đó xử lý 10 terabyte dữ liệu được gọi là "Thứ ba".

7

Bạn cần sắp xếp nó trước khi bạn gọi uniquevì uniquechỉ loại bỏ các bản sao nằm cạnh nhau.

chỉnh sửa: 38 giây ...

— David Johnstone
nguồn

7

uniquechỉ loại bỏ các phần tử trùng lặp liên tiếp (cần thiết để nó chạy trong thời gian tuyến tính), vì vậy bạn nên thực hiện sắp xếp trước. Nó sẽ vẫn được sắp xếp sau khi gọi đến unique.

— Peter
nguồn

7

Nếu bạn không muốn thay đổi thứ tự các yếu tố, thì bạn có thể thử giải pháp này:

template <class T>
void RemoveDuplicatesInVector(std::vector<T> & vec)
{
    set<T> values;
    vec.erase(std::remove_if(vec.begin(), vec.end(), [&](const T & value) { return !values.insert(value).second; }), vec.end());
}

— yury
nguồn

Có lẽ sử dụng unordered_set thay vì set (và boost :: remove_erase_if nếu có)

— gast128

4

Giả sử rằng a là một vectơ, loại bỏ các trùng lặp liền kề bằng cách sử dụng

a.erase(unique(a.begin(),a.end()),a.end());chạy trong thời gian O (n) .

— KPMG
nguồn

1

trùng lặp tiếp giáp. ok, vì vậy nó cần một std::sortđầu tiên.

— v.oddou

2

Như đã nêu, uniqueyêu cầu một container được sắp xếp. Ngoài ra, uniquekhông thực sự loại bỏ các yếu tố từ container. Thay vào đó, chúng được sao chép đến cuối, uniquetrả về một trình vòng lặp trỏ đến phần tử trùng lặp đầu tiên như vậy và bạn sẽ phải gọi eraseđể thực sự loại bỏ các phần tử.

— Max Lybbert
nguồn

Có phải duy nhất yêu cầu một bộ chứa được sắp xếp, hoặc nó chỉ đơn giản là sắp xếp lại chuỗi đầu vào để nó không chứa các bản sao liền kề? Tôi nghĩ sau.

@Pate, bạn đã đúng. Nó không yêu cầu một. Nó loại bỏ các bản sao liền kề.

— Bill Lynch

Nếu bạn có một bộ chứa có thể có các bản sao và bạn muốn một bộ chứa không có bất kỳ giá trị trùng lặp nào ở bất kỳ nơi nào trong bộ chứa thì trước tiên bạn phải sắp xếp bộ chứa, sau đó chuyển nó thành duy nhất, sau đó sử dụng xóa để thực sự loại bỏ các bản sao . Nếu bạn chỉ muốn loại bỏ các bản sao liền kề, thì bạn sẽ không phải sắp xếp vùng chứa. Nhưng bạn sẽ kết thúc với các giá trị trùng lặp: 1 2 2 3 2 4 2 5 2 sẽ được thay đổi thành 1 2 3 2 4 2 5 2 nếu được chuyển thành duy nhất mà không sắp xếp, 1 2 3 4 5 nếu được sắp xếp, chuyển sang duy nhất và xóa .

— Max Lybbert

2

Cách tiếp cận tiêu chuẩn được đề xuất bởi Nate Kohl, chỉ cần sử dụng vector, sort + unique:

sort( vec.begin(), vec.end() );
vec.erase( unique( vec.begin(), vec.end() ), vec.end() );

không làm việc cho một vectơ con trỏ.

Hãy xem xét kỹ ví dụ này trên cplusplus.com .

Trong ví dụ của họ, "cái được gọi là trùng lặp" được di chuyển đến cuối thực sự được hiển thị là? (các giá trị không xác định), bởi vì các "phần tử được gọi là trùng lặp" là SOMETIMES "các phần tử bổ sung" và SOMETIMES có "các phần tử bị thiếu" nằm trong vectơ gốc.

Xảy ra sự cố khi sử dụng std::unique() trên vectơ con trỏ tới các đối tượng (rò rỉ bộ nhớ, đọc dữ liệu xấu từ HEAP, giải phóng trùng lặp, gây ra lỗi phân đoạn, v.v.).

Đây là giải pháp của tôi cho vấn đề: thay thế std::unique()bằngptgi::unique() .

Xem tập tin ptgi_unique.hpp bên dưới:

// ptgi::unique()
//
// Fix a problem in std::unique(), such that none of the original elts in the collection are lost or duplicate.
// ptgi::unique() has the same interface as std::unique()
//
// There is the 2 argument version which calls the default operator== to compare elements.
//
// There is the 3 argument version, which you can pass a user defined functor for specialized comparison.
//
// ptgi::unique() is an improved version of std::unique() which doesn't looose any of the original data
// in the collection, nor does it create duplicates.
//
// After ptgi::unique(), every old element in the original collection is still present in the re-ordered collection,
// except that duplicates have been moved to a contiguous range [dupPosition, last) at the end.
//
// Thus on output:
//  [begin, dupPosition) range are unique elements.
//  [dupPosition, last) range are duplicates which can be removed.
// where:
//  [] means inclusive, and
//  () means exclusive.
//
// In the original std::unique() non-duplicates at end are moved downward toward beginning.
// In the improved ptgi:unique(), non-duplicates at end are swapped with duplicates near beginning.
//
// In addition if you have a collection of ptrs to objects, the regular std::unique() will loose memory,
// and can possibly delete the same pointer multiple times (leading to SEGMENTATION VIOLATION on Linux machines)
// but ptgi::unique() won't.  Use valgrind(1) to find such memory leak problems!!!
//
// NOTE: IF you have a vector of pointers, that is, std::vector<Object*>, then upon return from ptgi::unique()
// you would normally do the following to get rid of the duplicate objects in the HEAP:
//
//  // delete objects from HEAP
//  std::vector<Object*> objects;
//  for (iter = dupPosition; iter != objects.end(); ++iter)
//  {
//      delete (*iter);
//  }
//
//  // shrink the vector. But Object * pointers are NOT followed for duplicate deletes, this shrinks the vector.size())
//  objects.erase(dupPosition, objects.end));
//
// NOTE: But if you have a vector of objects, that is: std::vector<Object>, then upon return from ptgi::unique(), it
// suffices to just call vector:erase(, as erase will automatically call delete on each object in the
// [dupPosition, end) range for you:
//
//  std::vector<Object> objects;
//  objects.erase(dupPosition, last);
//
//==========================================================================================================
// Example of differences between std::unique() vs ptgi::unique().
//
//  Given:
//      int data[] = {10, 11, 21};
//
//  Given this functor: ArrayOfIntegersEqualByTen:
//      A functor which compares two integers a[i] and a[j] in an int a[] array, after division by 10:
//  
//  // given an int data[] array, remove consecutive duplicates from it.
//  // functor used for std::unique (BUGGY) or ptgi::unique(IMPROVED)
//
//  // Two numbers equal if, when divided by 10 (integer division), the quotients are the same.
//  // Hence 50..59 are equal, 60..69 are equal, etc.
//  struct ArrayOfIntegersEqualByTen: public std::equal_to<int>
//  {
//      bool operator() (const int& arg1, const int& arg2) const
//      {
//          return ((arg1/10) == (arg2/10));
//      }
//  };
//  
//  Now, if we call (problematic) std::unique( data, data+3, ArrayOfIntegersEqualByTen() );
//  
//  TEST1: BEFORE UNIQ: 10,11,21
//  TEST1: AFTER UNIQ: 10,21,21
//  DUP_INX=2
//  
//      PROBLEM: 11 is lost, and extra 21 has been added.
//  
//  More complicated example:
//  
//  TEST2: BEFORE UNIQ: 10,20,21,22,30,31,23,24,11
//  TEST2: AFTER UNIQ: 10,20,30,23,11,31,23,24,11
//  DUP_INX=5
//  
//      Problem: 21 and 22 are deleted.
//      Problem: 11 and 23 are duplicated.
//  
//  
//  NOW if ptgi::unique is called instead of std::unique, both problems go away:
//  
//  DEBUG: TEST1: NEW_WAY=1
//  TEST1: BEFORE UNIQ: 10,11,21
//  TEST1: AFTER UNIQ: 10,21,11
//  DUP_INX=2
//  
//  DEBUG: TEST2: NEW_WAY=1
//  TEST2: BEFORE UNIQ: 10,20,21,22,30,31,23,24,11
//  TEST2: AFTER UNIQ: 10,20,30,23,11,31,22,24,21
//  DUP_INX=5
//
//  @SEE: look at the "case study" below to understand which the last "AFTER UNIQ" results with that order:
//  TEST2: AFTER UNIQ: 10,20,30,23,11,31,22,24,21
//
//==========================================================================================================
// Case Study: how ptgi::unique() works:
//  Remember we "remove adjacent duplicates".
//  In this example, the input is NOT fully sorted when ptgi:unique() is called.
//
//  I put | separatators, BEFORE UNIQ to illustrate this
//  10  | 20,21,22 |  30,31 |  23,24 | 11
//
//  In example above, 20, 21, 22 are "same" since dividing by 10 gives 2 quotient.
//  And 30,31 are "same", since /10 quotient is 3.
//  And 23, 24 are same, since /10 quotient is 2.
//  And 11 is "group of one" by itself.
//  So there are 5 groups, but the 4th group (23, 24) happens to be equal to group 2 (20, 21, 22)
//  So there are 5 groups, and the 5th group (11) is equal to group 1 (10)
//
//  R = result
//  F = first
//
//  10, 20, 21, 22, 30, 31, 23, 24, 11
//  R    F
//
//  10 is result, and first points to 20, and R != F (10 != 20) so bump R:
//       R
//       F
//
//  Now we hits the "optimized out swap logic".
//  (avoid swap because R == F)
//
//  // now bump F until R != F (integer division by 10)
//  10, 20, 21, 22, 30, 31, 23, 24, 11
//       R   F              // 20 == 21 in 10x
//       R       F              // 20 == 22 in 10x
//       R           F          // 20 != 30, so we do a swap of ++R and F
//  (Now first hits 21, 22, then finally 30, which is different than R, so we swap bump R to 21 and swap with  30)
//  10, 20, 30, 22, 21, 31, 23, 24, 11  // after R & F swap (21 and 30)
//           R       F 
//
//  10, 20, 30, 22, 21, 31, 23, 24, 11
//           R          F           // bump F to 31, but R and F are same (30 vs 31)
//           R               F      // bump F to 23, R != F, so swap ++R with F
//  10, 20, 30, 22, 21, 31, 23, 24, 11
//                  R           F       // bump R to 22
//  10, 20, 30, 23, 21, 31, 22, 24, 11  // after the R & F swap (22 & 23 swap)
//                  R            F      // will swap 22 and 23
//                  R                F      // bump F to 24, but R and F are same in 10x
//                  R                    F  // bump F, R != F, so swap ++R  with F
//                      R                F  // R and F are diff, so swap ++R  with F (21 and 11)
//  10, 20, 30, 23, 11, 31, 22, 24, 21
//                      R                F  // aftter swap of old 21 and 11
//                      R                  F    // F now at last(), so loop terminates
//                          R               F   // bump R by 1 to point to dupPostion (first duplicate in range)
//
//  return R which now points to 31
//==========================================================================================================
// NOTES:
// 1) the #ifdef IMPROVED_STD_UNIQUE_ALGORITHM documents how we have modified the original std::unique().
// 2) I've heavily unit tested this code, including using valgrind(1), and it is *believed* to be 100% defect-free.
//
//==========================================================================================================
// History:
//  130201  dpb dbednar@ptgi.com created
//==========================================================================================================

#ifndef PTGI_UNIQUE_HPP
#define PTGI_UNIQUE_HPP

// Created to solve memory leak problems when calling std::unique() on a vector<Route*>.
// Memory leaks discovered with valgrind and unitTesting.


#include <algorithm>        // std::swap

// instead of std::myUnique, call this instead, where arg3 is a function ptr
//
// like std::unique, it puts the dups at the end, but it uses swapping to preserve original
// vector contents, to avoid memory leaks and duplicate pointers in vector<Object*>.

#ifdef IMPROVED_STD_UNIQUE_ALGORITHM
#error the #ifdef for IMPROVED_STD_UNIQUE_ALGORITHM was defined previously.. Something is wrong.
#endif

#undef IMPROVED_STD_UNIQUE_ALGORITHM
#define IMPROVED_STD_UNIQUE_ALGORITHM

// similar to std::unique, except that this version swaps elements, to avoid
// memory leaks, when vector contains pointers.
//
// Normally the input is sorted.
// Normal std::unique:
// 10 20 20 20 30   30 20 20 10
// a  b  c  d  e    f  g  h  i
//
// 10 20 30 20 10 | 30 20 20 10
// a  b  e  g  i    f  g  h  i
//
// Now GONE: c, d.
// Now DUPS: g, i.
// This causes memory leaks and segmenation faults due to duplicate deletes of same pointer!


namespace ptgi {

// Return the position of the first in range of duplicates moved to end of vector.
//
// uses operator==  of class for comparison
//
// @param [first, last) is a range to find duplicates within.
//
// @return the dupPosition position, such that [dupPosition, end) are contiguous
// duplicate elements.
// IF all items are unique, then it would return last.
//
template <class ForwardIterator>
ForwardIterator unique( ForwardIterator first, ForwardIterator last)
{
    // compare iterators, not values
    if (first == last)
        return last;

    // remember the current item that we are looking at for uniqueness
    ForwardIterator result = first;

    // result is slow ptr where to store next unique item
    // first is  fast ptr which is looking at all elts

    // the first iterator moves over all elements [begin+1, end).
    // while the current item (result) is the same as all elts
    // to the right, (first) keeps going, until you find a different
    // element pointed to by *first.  At that time, we swap them.

    while (++first != last)
    {
        if (!(*result == *first))
        {
#ifdef IMPROVED_STD_UNIQUE_ALGORITHM
            // inc result, then swap *result and *first

//          THIS IS WHAT WE WANT TO DO.
//          BUT THIS COULD SWAP AN ELEMENT WITH ITSELF, UNCECESSARILY!!!
//          std::swap( *first, *(++result));

            // BUT avoid swapping with itself when both iterators are the same
            ++result;
            if (result != first)
                std::swap( *first, *result);
#else
            // original code found in std::unique()
            // copies unique down
            *(++result) = *first;
#endif
        }
    }

    return ++result;
}

template <class ForwardIterator, class BinaryPredicate>
ForwardIterator unique( ForwardIterator first, ForwardIterator last, BinaryPredicate pred)
{
    if (first == last)
        return last;

    // remember the current item that we are looking at for uniqueness
    ForwardIterator result = first;

    while (++first != last)
    {
        if (!pred(*result,*first))
        {
#ifdef IMPROVED_STD_UNIQUE_ALGORITHM
            // inc result, then swap *result and *first

//          THIS COULD SWAP WITH ITSELF UNCECESSARILY
//          std::swap( *first, *(++result));
//
            // BUT avoid swapping with itself when both iterators are the same
            ++result;
            if (result != first)
                std::swap( *first, *result);

#else
            // original code found in std::unique()
            // copies unique down
            // causes memory leaks, and duplicate ptrs
            // and uncessarily moves in place!
            *(++result) = *first;
#endif
        }
    }

    return ++result;
}

// from now on, the #define is no longer needed, so get rid of it
#undef IMPROVED_STD_UNIQUE_ALGORITHM

} // end ptgi:: namespace

#endif

Và đây là chương trình UNIT Test mà tôi đã sử dụng để kiểm tra nó:

// QUESTION: in test2, I had trouble getting one line to compile,which was caused  by the declaration of operator()
// in the equal_to Predicate.  I'm not sure how to correctly resolve that issue.
// Look for //OUT lines
//
// Make sure that NOTES in ptgi_unique.hpp are correct, in how we should "cleanup" duplicates
// from both a vector<Integer> (test1()) and vector<Integer*> (test2).
// Run this with valgrind(1).
//
// In test2(), IF we use the call to std::unique(), we get this problem:
//
//  [dbednar@ipeng8 TestSortRoutes]$ ./Main7
//  TEST2: ORIG nums before UNIQUE: 10, 20, 21, 22, 30, 31, 23, 24, 11
//  TEST2: modified nums AFTER UNIQUE: 10, 20, 30, 23, 11, 31, 23, 24, 11
//  INFO: dupInx=5
//  TEST2: uniq = 10
//  TEST2: uniq = 20
//  TEST2: uniq = 30
//  TEST2: uniq = 33427744
//  TEST2: uniq = 33427808
//  Segmentation fault (core dumped)
//
// And if we run valgrind we seen various error about "read errors", "mismatched free", "definitely lost", etc.
//
//  valgrind --leak-check=full ./Main7
//  ==359== Memcheck, a memory error detector
//  ==359== Command: ./Main7
//  ==359== Invalid read of size 4
//  ==359== Invalid free() / delete / delete[]
//  ==359== HEAP SUMMARY:
//  ==359==     in use at exit: 8 bytes in 2 blocks
//  ==359== LEAK SUMMARY:
//  ==359==    definitely lost: 8 bytes in 2 blocks
// But once we replace the call in test2() to use ptgi::unique(), all valgrind() error messages disappear.
//
// 130212   dpb dbednar@ptgi.com created
// =========================================================================================================

#include <iostream> // std::cout, std::cerr
#include <string>
#include <vector>   // std::vector
#include <sstream>  // std::ostringstream
#include <algorithm>    // std::unique()
#include <functional>   // std::equal_to(), std::binary_function()
#include <cassert>  // assert() MACRO

#include "ptgi_unique.hpp"  // ptgi::unique()



// Integer is small "wrapper class" around a primitive int.
// There is no SETTER, so Integer's are IMMUTABLE, just like in JAVA.

class Integer
{
private:
    int num;
public:

    // default CTOR: "Integer zero;"
    // COMPRENSIVE CTOR:  "Integer five(5);"
    Integer( int num = 0 ) :
        num(num)
    {
    }

    // COPY CTOR
    Integer( const Integer& rhs) :
        num(rhs.num)
    {
    }

    // assignment, operator=, needs nothing special... since all data members are primitives

    // GETTER for 'num' data member
    // GETTER' are *always* const
    int getNum() const
    {
        return num;
    }   

    // NO SETTER, because IMMUTABLE (similar to Java's Integer class)

    // @return "num"
    // NB: toString() should *always* be a const method
    //
    // NOTE: it is probably more efficient to call getNum() intead
    // of toString() when printing a number:
    //
    // BETTER to do this:
    //  Integer five(5);
    //  std::cout << five.getNum() << "\n"
    // than this:
    //  std::cout << five.toString() << "\n"

    std::string toString() const
    {
        std::ostringstream oss;
        oss << num;
        return oss.str();
    }
};

// convenience typedef's for iterating over std::vector<Integer>
typedef std::vector<Integer>::iterator      IntegerVectorIterator;
typedef std::vector<Integer>::const_iterator    ConstIntegerVectorIterator;

// convenience typedef's for iterating over std::vector<Integer*>
typedef std::vector<Integer*>::iterator     IntegerStarVectorIterator;
typedef std::vector<Integer*>::const_iterator   ConstIntegerStarVectorIterator;

// functor used for std::unique or ptgi::unique() on a std::vector<Integer>
// Two numbers equal if, when divided by 10 (integer division), the quotients are the same.
// Hence 50..59 are equal, 60..69 are equal, etc.
struct IntegerEqualByTen: public std::equal_to<Integer>
{
    bool operator() (const Integer& arg1, const Integer& arg2) const
    {
        return ((arg1.getNum()/10) == (arg2.getNum()/10));
    }
};

// functor used for std::unique or ptgi::unique on a std::vector<Integer*>
// Two numbers equal if, when divided by 10 (integer division), the quotients are the same.
// Hence 50..59 are equal, 60..69 are equal, etc.
struct IntegerEqualByTenPointer: public std::equal_to<Integer*>
{
    // NB: the Integer*& looks funny to me!
    // TECHNICAL PROBLEM ELSEWHERE so had to remove the & from *&
//OUT   bool operator() (const Integer*& arg1, const Integer*& arg2) const
//
    bool operator() (const Integer* arg1, const Integer* arg2) const
    {
        return ((arg1->getNum()/10) == (arg2->getNum()/10));
    }
};

void test1();
void test2();
void printIntegerStarVector( const std::string& msg, const std::vector<Integer*>& nums );

int main()
{
    test1();
    test2();
    return 0;
}

// test1() uses a vector<Object> (namely vector<Integer>), so there is no problem with memory loss
void test1()
{
    int data[] = { 10, 20, 21, 22, 30, 31, 23, 24, 11};

    // turn C array into C++ vector
    std::vector<Integer> nums(data, data+9);

    // arg3 is a functor
    IntegerVectorIterator dupPosition = ptgi::unique( nums.begin(), nums.end(), IntegerEqualByTen() );

    nums.erase(dupPosition, nums.end());

    nums.erase(nums.begin(), dupPosition);
}

//==================================================================================
// test2() uses a vector<Integer*>, so after ptgi:unique(), we have to be careful in
// how we eliminate the duplicate Integer objects stored in the heap.
//==================================================================================
void test2()
{
    int data[] = { 10, 20, 21, 22, 30, 31, 23, 24, 11};

    // turn C array into C++ vector of Integer* pointers
    std::vector<Integer*> nums;

    // put data[] integers into equivalent Integer* objects in HEAP
    for (int inx = 0; inx < 9; ++inx)
    {
        nums.push_back( new Integer(data[inx]) );
    }

    // print the vector<Integer*> to stdout
    printIntegerStarVector( "TEST2: ORIG nums before UNIQUE", nums );

    // arg3 is a functor
#if 1
    // corrected version which fixes SEGMENTATION FAULT and all memory leaks reported by valgrind(1)
    // I THINK we want to use new C++11 cbegin() and cend(),since the equal_to predicate is passed "Integer *&"

//  DID NOT COMPILE
//OUT   IntegerStarVectorIterator dupPosition = ptgi::unique( const_cast<ConstIntegerStarVectorIterator>(nums.begin()), const_cast<ConstIntegerStarVectorIterator>(nums.end()), IntegerEqualByTenPointer() );

    // DID NOT COMPILE when equal_to predicate declared "Integer*& arg1, Integer*&  arg2"
//OUT   IntegerStarVectorIterator dupPosition = ptgi::unique( const_cast<nums::const_iterator>(nums.begin()), const_cast<nums::const_iterator>(nums.end()), IntegerEqualByTenPointer() );


    // okay when equal_to predicate declared "Integer* arg1, Integer*  arg2"
    IntegerStarVectorIterator dupPosition = ptgi::unique(nums.begin(), nums.end(), IntegerEqualByTenPointer() );
#else
    // BUGGY version that causes SEGMENTATION FAULT and valgrind(1) errors
    IntegerStarVectorIterator dupPosition = std::unique( nums.begin(), nums.end(), IntegerEqualByTenPointer() );
#endif

    printIntegerStarVector( "TEST2: modified nums AFTER UNIQUE", nums );
    int dupInx = dupPosition - nums.begin();
    std::cout << "INFO: dupInx=" << dupInx <<"\n";

    // delete the dup Integer* objects in the [dupPosition, end] range
    for (IntegerStarVectorIterator iter = dupPosition; iter != nums.end(); ++iter)
    {
        delete (*iter);
    }

    // shrink the vector
    // NB: the Integer* ptrs are NOT followed by vector::erase()
    nums.erase(dupPosition, nums.end());


    // print the uniques, by following the iter to the Integer* pointer
    for (IntegerStarVectorIterator iter = nums.begin(); iter != nums.end();  ++iter)
    {
        std::cout << "TEST2: uniq = " << (*iter)->getNum() << "\n";
    }

    // remove the unique objects from heap
    for (IntegerStarVectorIterator iter = nums.begin(); iter != nums.end();  ++iter)
    {
        delete (*iter);
    }

    // shrink the vector
    nums.erase(nums.begin(), nums.end());

    // the vector should now be completely empty
    assert( nums.size() == 0);
}

//@ print to stdout the string: "info_msg: num1, num2, .... numN\n"
void printIntegerStarVector( const std::string& msg, const std::vector<Integer*>& nums )
{
    std::cout << msg << ": ";
    int inx = 0;
    ConstIntegerStarVectorIterator  iter;

    // use const iterator and const range!
    // NB: cbegin() and cend() not supported until LATER (c++11)
    for (iter = nums.begin(), inx = 0; iter != nums.end(); ++iter, ++inx)
    {
        // output a comma seperator *AFTER* first
        if (inx > 0)
            std::cout << ", ";

        // call Integer::toString()
        std::cout << (*iter)->getNum();     // send int to stdout
//      std::cout << (*iter)->toString();   // also works, but is probably slower

    }

    // in conclusion, add newline
    std::cout << "\n";
}

— joe
nguồn

Tôi không hiểu lý do ở đây. Vì vậy, nếu bạn có một thùng chứa các con trỏ và bạn muốn loại bỏ các bản sao, điều đó ảnh hưởng đến các đối tượng được trỏ bởi các con trỏ như thế nào? Không có rò rỉ bộ nhớ sẽ xảy ra bởi vì có ít nhất một con trỏ (và chính xác là một trong con trỏ này) trỏ đến chúng. Chà, tôi đoán rằng phương pháp của bạn có thể có một số giá trị với một số toán tử quá tải lạ hoặc các hàm so sánh lạ đòi hỏi phải xem xét đặc biệt.

— kccqzy

Không chắc chắn nếu tôi hiểu quan điểm của bạn. Lấy một trường hợp đơn giản của một vectơ <int *>, trong đó 4 con trỏ trỏ tới các số nguyên {1, 2. 2, 3}. Nó được sắp xếp, nhưng sau khi bạn gọi std :: unique, 4 con trỏ là con trỏ tới số nguyên {1, 2, 3, 3}. Bây giờ bạn có hai con trỏ giống hệt nhau đến 3, vì vậy nếu bạn gọi xóa, nó sẽ xóa trùng lặp. XẤU! Thứ hai, lưu ý rằng 2 thứ 2 bị lỗi, rò rỉ bộ nhớ.

— joe

kccqzy, đây là chương trình ví dụ để bạn hiểu rõ hơn câu trả lời của tôi:

— joe

@joe: Ngay cả khi sau khi std::uniquebạn có [1, 2, 3, 2], bạn không thể gọi xóa trên 2 vì điều đó sẽ để lại một con trỏ lơ lửng thành 2! => Đơn giản là đừng gọi xóa trên các phần tử giữa newEnd = std::uniquevà std::endvì bạn vẫn có con trỏ tới các phần tử này [std::begin, newEnd)!

— MFH

2

@ArneVogel: Có lẽ đối với các giá trị tầm thường của "hoạt động tốt". Thật vô nghĩa khi gọi uniquemột vector<unique_ptr<T>>, vì giá trị trùng lặp duy nhất mà vectơ có thể chứa là nullptr.

— Ben Voigt

2

Với thư viện Ranges (sắp có trong C ++ 20), bạn chỉ cần sử dụng

action::unique(vec);

Lưu ý rằng nó thực sự loại bỏ các yếu tố trùng lặp, không chỉ di chuyển chúng.

— LF
nguồn

1

Về điểm chuẩn alexK7. Tôi đã thử chúng và nhận được kết quả tương tự, nhưng khi phạm vi giá trị là 1 triệu trường hợp sử dụng std :: sort (f1) và sử dụng std :: unordered_set (f5) tạo ra thời gian tương tự. Khi phạm vi của các giá trị là 10 triệu f1 thì nhanh hơn f5.

Nếu phạm vi của các giá trị bị giới hạn và các giá trị không dấu int, có thể sử dụng std :: vector, kích thước tương ứng với phạm vi đã cho. Đây là mã:

void DeleteDuplicates_vector_bool(std::vector<unsigned>& v, unsigned range_size)
{
    std::vector<bool> v1(range_size);
    for (auto& x: v)
    {
       v1[x] = true;    
    }
    v.clear();

    unsigned count = 0;
    for (auto& x: v1)
    {
        if (x)
        {
            v.push_back(count);
        }
        ++count;
    }
}

— Mikhail Semenov
nguồn

1

sort (v.begin (), v.end ()), v.erase (unique (v.begin (), v, end ()), v.end ());

— Yohanna
nguồn

1

Nếu bạn đang tìm kiếm hiệu suất và sử dụng std::vector, tôi khuyên bạn nên sử dụng liên kết tài liệu này.

std::vector<int> myvector{10,20,20,20,30,30,20,20,10};             // 10 20 20 20 30 30 20 20 10
std::sort(myvector.begin(), myvector.end() );
const auto& it = std::unique (myvector.begin(), myvector.end());   // 10 20 30 ?  ?  ?  ?  ?  ?
                                                                   //          ^
myvector.resize( std::distance(myvector.begin(),it) ); // 10 20 30

— Gines Hidalgo
nguồn

cplusplus.com không phải là tài liệu chính thức.

— Ilya Popov

0

std::set<int> s;
std::for_each(v.cbegin(), v.cend(), [&s](int val){s.insert(val);});
v.clear();
std::copy(s.cbegin(), s.cend(), v.cbegin());

— Wes
nguồn

1

có lẽ thay đổi kích thước vectơ sau khi xóa nó để chỉ có 1 cấp phát bộ nhớ khi xây dựng vectơ. Có thể thích std :: move thay vì std :: copy để di chuyển ints vào vector thay vì sao chép chúng vì bộ này sẽ không cần thiết sau này.

— YoungJohn

0

Nếu bạn không muốn sửa đổi vectơ (xóa, sắp xếp) thì bạn có thể sử dụng thư viện Newton , Trong thư viện con thuật toán có một lệnh gọi hàm, copy_single

template <class INPUT_ITERATOR, typename T>
    void copy_single( INPUT_ITERATOR first, INPUT_ITERATOR last, std::vector<T> &v )

vì vậy bạn có thể:

std::vector<TYPE> copy; // empty vector
newton::copy_single(first, last, copy);

Ở đâu bản sao là vectơ trong đó bạn muốn đẩy lùi bản sao của các phần tử duy nhất. nhưng hãy nhớ bạn đẩy lùi các yếu tố và bạn không tạo ra một vectơ mới

dù sao, điều này nhanh hơn vì bạn không xóa () các phần tử (mất rất nhiều thời gian, ngoại trừ khi bạn pop_back (), vì gán lại)

Tôi thực hiện một số thí nghiệm và nó nhanh hơn.

Ngoài ra, bạn có thể sử dụng:

std::vector<TYPE> copy; // empty vector
newton::copy_single(first, last, copy);
original = copy;

đôi khi vẫn nhanh hơn

— Moise Rojo
nguồn

1

Chức năng này có mặt trong thư viện tiêu chuẩn như unique_copy.

— LF

0

Mã dễ hiểu hơn từ: https://en.cppreference.com/w/cpp/alerskym/unique

#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
#include <cctype>

int main() 
{
    // remove duplicate elements
    std::vector<int> v{1,2,3,1,2,3,3,4,5,4,5,6,7};
    std::sort(v.begin(), v.end()); // 1 1 2 2 3 3 3 4 4 5 5 6 7 
    auto last = std::unique(v.begin(), v.end());
    // v now holds {1 2 3 4 5 6 7 x x x x x x}, where 'x' is indeterminate
    v.erase(last, v.end()); 
    for (int i : v)
      std::cout << i << " ";
    std::cout << "\n";
}

thông báo:

1 2 3 4 5 6 7

— Jayhello
nguồn

0

void removeDuplicates(std::vector<int>& arr) {
    for (int i = 0; i < arr.size(); i++)
    {
        for (int j = i + 1; j < arr.size(); j++)
        {
            if (arr[i] > arr[j])
            {
                int temp = arr[i];
                arr[i] = arr[j];
                arr[j] = temp;
            }
        }
    }
    std::vector<int> y;
    int x = arr[0];
    int i = 0;
    while (i < arr.size())
    {
        if (x != arr[i])
        {
            y.push_back(x);
            x = arr[i];
        }
        i++;
        if (i == arr.size())
            y.push_back(arr[i - 1]);
    }
    arr = y;
}

— robertlucian13
nguồn

2

Chào mừng bạn đến với StackOverflow! Vui lòng chỉnh sửa câu hỏi của bạn để thêm giải thích về cách mã của bạn hoạt động và tại sao nó tương đương hoặc tốt hơn các câu trả lời khác. Câu hỏi này đã hơn mười năm tuổi và đã có nhiều câu trả lời hay, được giải thích rõ ràng. Nếu không có lời giải thích về bạn, nó không hữu ích và có khả năng bị hạ cấp hoặc loại bỏ.

— Das_Geek

-1

Đây là ví dụ về vấn đề xóa trùng lặp xảy ra với std :: unique (). Trên máy LINUX, chương trình gặp sự cố. Đọc các bình luận để biết chi tiết.

// Main10.cpp
//
// Illustration of duplicate delete and memory leak in a vector<int*> after calling std::unique.
// On a LINUX machine, it crashes the progam because of the duplicate delete.
//
// INPUT : {1, 2, 2, 3}
// OUTPUT: {1, 2, 3, 3}
//
// The two 3's are actually pointers to the same 3 integer in the HEAP, which is BAD
// because if you delete both int* pointers, you are deleting the same memory
// location twice.
//
//
// Never mind the fact that we ignore the "dupPosition" returned by std::unique(),
// but in any sensible program that "cleans up after istelf" you want to call deletex
// on all int* poitners to avoid memory leaks.
//
//
// NOW IF you replace std::unique() with ptgi::unique(), all of the the problems disappear.
// Why? Because ptgi:unique merely reshuffles the data:
// OUTPUT: {1, 2, 3, 2}
// The ptgi:unique has swapped the last two elements, so all of the original elements in
// the INPUT are STILL in the OUTPUT.
//
// 130215   dbednar@ptgi.com
//============================================================================

#include <iostream>
#include <vector>
#include <algorithm>
#include <functional>

#include "ptgi_unique.hpp"

// functor used by std::unique to remove adjacent elts from vector<int*>
struct EqualToVectorOfIntegerStar: public std::equal_to<int *>
{
    bool operator() (const int* arg1, const int* arg2) const
    {
        return (*arg1 == *arg2);
    }
};

void printVector( const std::string& msg, const std::vector<int*>& vnums);

int main()
{
    int inums [] = { 1, 2, 2, 3 };
    std::vector<int*> vnums;

    // convert C array into vector of pointers to integers
    for (size_t inx = 0; inx < 4; ++ inx)
        vnums.push_back( new int(inums[inx]) );

    printVector("BEFORE UNIQ", vnums);

    // INPUT : 1, 2A, 2B, 3
    std::unique( vnums.begin(), vnums.end(), EqualToVectorOfIntegerStar() );
    // OUTPUT: 1, 2A, 3, 3 }
    printVector("AFTER  UNIQ", vnums);

    // now we delete 3 twice, and we have a memory leak because 2B is not deleted.
    for (size_t inx = 0; inx < vnums.size(); ++inx)
    {
        delete(vnums[inx]);
    }
}

// print a line of the form "msg: 1,2,3,..,5,6,7\n", where 1..7 are the numbers in vnums vector
// PS: you may pass "hello world" (const char *) because of implicit (automatic) conversion
// from "const char *" to std::string conversion.

void printVector( const std::string& msg, const std::vector<int*>& vnums)
{
    std::cout << msg << ": ";

    for (size_t inx = 0; inx < vnums.size(); ++inx)
    {
        // insert comma separator before current elt, but ONLY after first elt
        if (inx > 0)
            std::cout << ",";
        std::cout << *vnums[inx];

    }
    std::cout << "\n";
}

— joe
nguồn

PS: Tôi cũng đã chạy "valgrind ./Main10" và valgrind không tìm thấy vấn đề gì. Tôi thật sự khuyên tất cả các lập trình viên C ++ sử dụng LINUX nên sử dụng công cụ rất hiệu quả này, đặc biệt nếu bạn đang viết các ứng dụng thời gian thực phải chạy 24x7 và không bao giờ bị rò rỉ hoặc gặp sự cố!

— joe

Trọng tâm của vấn đề với std :: unique có thể được tóm tắt bằng câu lệnh này "std :: unique trả về các bản sao ở trạng thái không xác định" !!!!! Tại sao ủy ban tiêu chuẩn đã làm điều này, tôi sẽ không bao giờ biết. Thành viên của Comme .. có ý kiến gì không ???

— joe

1

Có, "std :: unique trả về các bản sao ở trạng thái không xác định". Vì vậy, chỉ cần không dựa vào một mảng đã được "duy nhất" để quản lý bộ nhớ theo cách thủ công! Cách đơn giản nhất để làm điều này là sử dụng std :: unique_ptr thay vì con trỏ thô.

— alexk7

Điều này dường như là một phản ứng cho một câu trả lời khác nhau; nó không trả lời câu hỏi (trong đó vectorchứa số nguyên, không phải con trỏ và không chỉ định bộ so sánh).

— Toby Speight

-2

void EraseVectorRepeats(vector <int> & v){ 
TOP:for(int y=0; y<v.size();++y){
        for(int z=0; z<v.size();++z){
            if(y==z){ //This if statement makes sure the number that it is on is not erased-just skipped-in order to keep only one copy of a repeated number
                continue;}
            if(v[y]==v[z]){
                v.erase(v.begin()+z); //whenever a number is erased the function goes back to start of the first loop because the size of the vector changes
            goto TOP;}}}}

Đây là một chức năng mà tôi đã tạo mà bạn có thể sử dụng để xóa lặp lại. Các tập tin tiêu đề cần thiết là chỉ <iostream>và <vector>.

— GrabeS
nguồn